ALPHA-1,4-GLUCAN LYASE FROM A FUNGUS, ITS PURIFICATION, GENE CLONING AND
EXPRESSION IN MICROORGANISMS

The present invention relates to an enzyme, in particular α-1,4-glucan lyase ("GL").
The present invention also relates to a method of extracting same. 

FR-A-2617502 and Baute et al in Phytochemistry [1988] vol. 27 No. 11 pp 3401-3403
report on the production of 1,5-D-anhydrofructose ("AF") in Morchella vulgaris by
an apparent enzymatic reaction. The yield of production of AF is quite low. Despite 
a reference to a possible enzymatic reaction, neither of these two documents presents 
any amino acid sequence data for any enzyme let alone any nucleotide sequence 
information. These documents say that AF can be a precursor for the preparation of 
the antibiotic pyrone microthecin. 

Yu et al in Biochimica et Biophysica Acta [1993] vol 1156 pp 313-320 report on the
preparation of GL from red seaweed and its use to degrade α-1,4-glucan to produce
AF. The yield of production of AF is quite low. Despite a reference to the enzyme 
GL this document does not present any amino acid sequence data for that enzyme let 
alone any nucleotide sequence information coding for the same. This document also 
suggests that the source of GL is just algal. 

According to the present invention there is provided a method of preparing the 
enzyme α-1,4-glucan lyase comprising isolating the enzyme from a culture of a
fungus wherein the culture is substantially free of any other organism. 

Preferably the enzyme is isolated and/or further purified using a gel that is not 
degraded by the enzyme. 

Preferably the gel is based on dextrin or derivatives thereof, preferably a 
cyclodextrin, more preferably beta-cyclodextrin. 

According to the present invention there is also provided a GL enzyme prepared by
the method of the present invention.

Preferably the fungus is Morchella costata or Morchella vulgaris.

Preferably the enzyme comprises the amino acid sequence SEQ. I.D. No. 1 or SEQ.
I.D. No. 2, or any variant thereof.

The term "any variant thereof" means any substitution of, variation of, modification
of, replacement of, deletion of or addition of an amino acid from or to the sequence
providing the resultant enzyme has lyase activity.

According to the present invention there is also provided a nucleotide sequence coding
for the enzyme α-1,4-glucan lyase, preferably wherein the sequence is not in its
natural environment (i.e. it does not form part of the natural genome of a cellular
organism expressing the enzyme).

Preferably the nucleotide sequence is a DNA sequence.

Preferably the DNA sequence comprises a sequence that is the same as, or is
complementary to, or has substantial homology with, or contains any suitable codon
substitution(s) for any of those of, SEQ. I.D. No. 3 or SEQ. I.D. No. 4.

The expression "substantial homology" covers homology with respect to structure
and/or nucleotide components and/or biological activity.

The expression "contains any suitable codon substitutions" covers any codon
replacement or substitution with another codon coding for the same amino acid or any
addition or removal thereof providing the resultant enzyme has lyase activity.

In other words, the present invention also covers a modified DNA sequence in which
at least one nucleotide has been deleted, substituted or modified or in which at least
one additional nucleotide has been inserted so as to encode a polypeptide having the
activity of a glucan lyase, preferably an enzyme having an increased lyase activity.

According to the present invention there is also provided a method of preparing the
enzyme α-1,4-glucan lyase comprising expressing the nucleotide sequence of the
present invention.



According to the present invention there is also provided the use of beta-cyclodextrin 
to purify an enzyme, preferably GL. 

According to the present invention there is also provided a nucleotide sequence 
wherein the DNA sequence is made up of at least a sequence that is the same as, or 
is complementary to, or has substantial homology with, or contains any suitable 
codon substitutions for any of those of, SEQ. I.D. No. 3 or SEQ. I.D. No. 4,
preferably wherein the sequence is in isolated form. 

The present invention therefore relates to the isolation of the enzyme α-1,4-glucan
lyase from a fungus. For example, the fungus can be any one of Discina perlata,
Discina parma, Gyromitra gigas, Gyromitra infula, Mitrophora hybrida, Morchella
conica, Morchella costata, Morchella elata, Morchella hortensis, Morchella rotunda,
Morchella vulgaris, Peziza badia, Sarcosphaera eximia, Disciotis venosa, Gyromitra
esculenta, Helvella crispa, Helvella lacunosa, Leptopodia elastica, Verpa
digitaliformis, and other forms of Morchella. Preferably the fungus is Morchella
costata or Morchella vulgaris.

The initial enzyme purification can be performed by the method as described by Yu
et al (ibid).

However, preferably, the initial enzyme purification includes an optimized procedure
in which a solid support is used that does not decompose under the purification step.
This gel support further has the advantage that it is compatible with standard
laboratory protein purification equipment.

The details of this optimized purification strategy are given later on. The purification
is terminated by known standard techniques for protein purification.

The purity of the enzyme can be readily established using complementary
electrophoretic techniques.

The purified lyase GL has been characterized according to pI, temperature- and pH-
optima.

In this regard the fungal lyase shows a pI around 5.4 as determined by isoelectric
focusing on gels with pH gradient of 3 to 9. The molecular weight determined by
SDS-PAGE on 8-25% gradient gels was 110 kDa. The enzyme exhibits a pH
optimum in the range pH 5-7. The temperature optimum was found to lie between
30-45°C.



GL sources     Optimal pH    Optimal pH range    Optimal temperature
M. costata     6.5           5.5-7.5             37°C; 40°C*
M. vulgaris    6.4           5.9-7.6             43°C; 48°C*

*Parameters determined using glycogen as substrate; other parameters determined
using amylopectin as substrate.

In a preferred embodiment the α-1,4-glucan lyase is purified from the fungus
Morchella costata by affinity chromatography on β-cyclodextrin Sepharose, ion exchange on
Mono Q HR 5/5 and gel filtration on Superose 12 columns.

PAS staining indicates that the fungal lyase was not glycosylated. In the cell-free
fungus extract, only one form of α-1,4-glucan lyase was detected by activity gel
staining on electrophoresis gels.

The enzyme should preferably be secreted to ease its purification. To do so the DNA
encoding the mature enzyme is fused to a signal sequence, a promoter and a
terminator from the chosen host.

For expression in Aspergillus niger the gpdA (from the glyceraldehyde-3-phosphate
dehydrogenase gene of Aspergillus nidulans) promoter and signal sequence are fused
to the 5' end of the DNA encoding the mature lyase - such as SEQ. I.D. No. 3 or
SEQ. I.D. No. 4. The terminator sequence from the A. niger trpC gene is placed 3'
to the gene (Punt, P.J. et al (1991): J. Biotech. 17, 19-34). This construction is
inserted into a vector containing a replication origin and selection origin for E. coli
and a selection marker for A. niger. Examples of selection markers for A. niger are
the amdS gene, the argB gene, the pyrG gene, the hygB gene and the BmlR gene, all
of which have been used for selection of transformants. This plasmid can be transformed
into A. niger and the mature lyase can be recovered from the culture medium of the
transformants.

The construction can be transformed into a protease deficient strain to reduce the 
proteolytic degradation of the lyase in the culture medium (Archer D.B. et al (1992): 
Biotechnol. Lett. 14, 357-362). 


The amino acid composition can be established according to the method of Barkholt
and Jensen (Anal Biochem [1989] vol 177 pp 318-322). The sample for the amino
acid analysis of the purified enzyme can contain 69 µg/ml protein.

The amino acid sequences of the GL enzymes according to the present invention are
shown in SEQ. I.D. No. 1 and SEQ. I.D. No. 2.

The following samples were deposited in accordance with the Budapest Treaty at the
recognised depositary The National Collections of Industrial and Marine Bacteria
Limited (NCIMB) at 23 St. Machar Drive, Aberdeen, Scotland, United Kingdom,
AB2 1RY on 3 October 1994:

E. coli containing plasmid pMC (NCIMB 40687) - [ref. DH5alpha-pMC];

E. coli containing plasmid pMV1 (NCIMB 40688) - [ref. DH5alpha-pMV1]; and

E. coli containing plasmid pMV2 (NCIMB 40689) - [ref. DH5alpha-pMV2].

Plasmid pMC is a pBluescript II KS containing a 4.1 kb fragment isolated from a
genomic library constructed from Morchella costata. The fragment contains a gene
coding for α-1,4-glucan lyase.

Plasmid pMV1 is a pBluescript II KS containing a 2.45 kb fragment isolated from a
genomic library constructed from Morchella vulgaris. The fragment contains the 5'
end of a gene coding for α-1,4-glucan lyase.

Plasmid pMV2 is a pUC19 containing a 3.1 kb fragment isolated from a genomic
library constructed from Morchella vulgaris. The fragment contains the 3' end of a
gene coding for α-1,4-glucan lyase.

In the following discussion, MC represents Morchella costata and MV represents 
Morchella vulgaris. 

As mentioned, the GL coding sequence from Morchella vulgaris was contained in two
plasmids. With reference to Figure 5 (discussed later) pMV1 contains the nucleotides
from position 454 to position 2902; and pMV2 contains the nucleotides downstream
from (and including) position 2897. With reference to Figures 2 and 3 (discussed
later), to ligate the coding sequences one can digest pMV2 with restriction enzymes
EcoRI and BamHI and then insert the relevant fragment into pMV1 digested with
restriction enzymes EcoRI and BamHI.
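
By way of illustration only, the overlap between the two Morchella vulgaris inserts can be checked with simple coordinate arithmetic. The sketch below assumes that the quoted positions are 1-based and inclusive, and it assumes that the pMV2 insert runs to base 4670 (the total length quoted for Figure 5); the text itself does not state where the pMV2 insert ends. The function name is hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Return the inclusive overlap of two 1-based intervals, or None."""
    start, end = max(a_start, b_start), min(a_end, b_end)
    return (start, end) if start <= end else None


# pMV1 insert: positions 454 to 2902 of Figure 5 (from the text).
PMV1 = (454, 2902)
# pMV2 insert: from position 2897 (from the text); the end position 4670
# is an ASSUMPTION based on the total length quoted for Figure 5.
PMV2 = (2897, 4670)

shared = overlap(*PMV1, *PMV2)
shared_len = shared[1] - shared[0] + 1
print(f"shared positions: {shared[0]}-{shared[1]} ({shared_len} bases)")
```

Under these assumptions the two inserts share positions 2897 to 2902, so the joined sequence is continuous across the junction.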

Thus highly preferred embodiments of the present invention include a GL enzyme
obtainable from the expression of the GL coding sequences present in plasmids that
are the subject of either deposit NCIMB 40687 or deposit NCIMB 40688 and deposit
NCIMB 40689.

The present invention will now be described only by way of example. 

In the following Examples reference is made to the accompanying figures in which: 

Figure 1 shows a plasmid map of pMC; 

Figure 2 shows a plasmid map of pMV1;

Figure 3 shows a plasmid map of pMV2; 

Figure 4 shows the GL coding sequence and part of the 5' and 3' non-translated
regions for genomic DNA obtained from Morchella costata; 

Figure 5 shows the GL coding sequence and part of the 5' and 3' non-translated 
regions for genomic DNA obtained from Morchella vulgaris; 

Figure 6 shows a comparison of the GL coding sequences and non-translated regions 
from Morchella costata and Morchella vulgaris; 

Figure 7 shows the amino acid sequence represented as SEQ. I.D. No. 1 showing 
positions of the peptide fragments that were sequenced; and 

Figure 8 shows the amino acid sequence represented as SEQ. I.D. No. 2 showing 
positions of the peptide fragments that were sequenced. 

In more detail, in Figure 4, the total number of bases is 4726 - and the DNA
sequence composition is: 1336 A; 1070 C; 1051 G; 1269 T. The ATG start codon
is shown in bold. The introns are underlined. The stop codon is shown in italics.

In Figure 5, the total number of bases is 4670 - and the DNA sequence composition
is: 1253 A; 1072 C; 1080 G; 1265 T. The ATG start codon is shown in bold. The
introns are underlined. The stop codon is shown in italics.
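
As a minimal sketch, the quoted base compositions can be cross-checked against the stated totals, and the same helper can be applied to any sequence string. The counts are copied from the text; the helper names and the GC calculation are illustrative additions and do not form part of the original analysis.

```python
from collections import Counter


def base_composition(seq):
    """Count A, C, G and T in a DNA string (case-insensitive)."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_percent(composition):
    """GC content as a percentage of all counted bases."""
    total = sum(composition.values())
    return 100.0 * (composition["G"] + composition["C"]) / total


# Counts quoted in the text for Figure 4 (MC) and Figure 5 (MV).
FIG4 = {"A": 1336, "C": 1070, "G": 1051, "T": 1269}
FIG5 = {"A": 1253, "C": 1072, "G": 1080, "T": 1265}

for name, comp, stated_total in (("Figure 4", FIG4, 4726), ("Figure 5", FIG5, 4670)):
    total = sum(comp.values())
    print(name, "total:", total, "matches text:", total == stated_total,
          "GC%:", round(gc_percent(comp), 2))
```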

In Figure 6, the two aligned sequences are those obtained from MC (total number of
residues: 1066) and MV (total number of residues: 1070). The comparison matrix
used was a structure-genetic matrix (Open gap cost: 10; Unit gap cost: 2). In this
Figure, the character to show that two aligned residues are identical is V. The
character to show that two aligned residues are similar is V. The amino acids said
to be 'similar' are: A,S,T; D,E; N,Q; R,K; I,L,M,V; F,Y,W. Overall there is:
Identity: 920 (86.30%); Similarity: 51 (4.78%). The number of gaps inserted in MC
is 1 and the number of gaps inserted in MV is 1.
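
For illustration, the similarity groups listed above can be expressed as a small classifier for aligned residue pairs. The sketch assumes that the quoted percentages are taken relative to the MC length of 1066 residues, which reproduces the figures in the text; the function names are hypothetical and the gap symbol "-" is an assumption.

```python
# Groups of amino acids treated as 'similar', as listed in the text.
SIMILAR_GROUPS = ["AST", "DE", "NQ", "RK", "ILMV", "FYW"]


def classify(a, b):
    """Classify one aligned pair of one-letter residues."""
    if a == "-" or b == "-":  # '-' as the gap symbol is an assumption
        return "gap"
    if a == b:
        return "identical"
    if any(a in group and b in group for group in SIMILAR_GROUPS):
        return "similar"
    return "different"


def percent(count, length):
    """Percentage of aligned positions, rounded to two decimals."""
    return round(100.0 * count / length, 2)


# Denominator of 1066 (MC length) is an assumption that matches the text.
print(percent(920, 1066))  # identity
print(percent(51, 1066))   # similarity
```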

In the attached sequence listings: SEQ. I.D. No. 1 is the amino-acid sequence for GL
obtained from Morchella costata; SEQ. I.D. No. 2 is the amino-acid sequence for GL
obtained from Morchella vulgaris; SEQ. I.D. No. 3 is the nucleotide coding sequence
for GL obtained from Morchella costata; and SEQ. I.D. No. 4 is the nucleotide
coding sequence for GL obtained from Morchella vulgaris.

In SEQ. I.D. No. 1 the total number of residues is 1066. The GL enzyme has an
amino acid composition of:

46 Ala    13 Cys    25 His    18 Met    73 Thr
50 Arg    37 Gln    54 Ile    43 Phe    23 Trp
56 Asn    55 Glu    70 Leu    56 Pro    71 Tyr
75 Asp    89 Gly    71 Lys    63 Ser    78 Val

In SEQ. I.D. No. 2 the total number of residues is 1070. The GL enzyme has an
amino acid composition of:

51 Ala    13 Cys    22 His    17 Met    71 Thr
50 Arg    40 Gln    57 Ile    45 Phe    24 Trp
62 Asn    58 Glu    74 Leu    62 Pro    69 Tyr
74 Asp    87 Gly    61 Lys    55 Ser    78 Val
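
As a simple consistency check, the two amino acid compositions can be summed and compared with the stated residue totals. The counts below are copied from the text; the dictionary names and the per-residue comparison are illustrative only.

```python
# Residue counts quoted for SEQ. I.D. No. 1 (MC) and SEQ. I.D. No. 2 (MV).
MC = {"Ala": 46, "Cys": 13, "His": 25, "Met": 18, "Thr": 73,
      "Arg": 50, "Gln": 37, "Ile": 54, "Phe": 43, "Trp": 23,
      "Asn": 56, "Glu": 55, "Leu": 70, "Pro": 56, "Tyr": 71,
      "Asp": 75, "Gly": 89, "Lys": 71, "Ser": 63, "Val": 78}
MV = {"Ala": 51, "Cys": 13, "His": 22, "Met": 17, "Thr": 71,
      "Arg": 50, "Gln": 40, "Ile": 57, "Phe": 45, "Trp": 24,
      "Asn": 62, "Glu": 58, "Leu": 74, "Pro": 62, "Tyr": 69,
      "Asp": 74, "Gly": 87, "Lys": 61, "Ser": 55, "Val": 78}

mc_total = sum(MC.values())
mv_total = sum(MV.values())
print("MC total:", mc_total, "MV total:", mv_total)

# Per-residue difference (MV minus MC), listed only where the counts differ.
differences = {aa: MV[aa] - MC[aa] for aa in MC if MV[aa] != MC[aa]}
print(differences)
```

Both sums agree with the totals of 1066 and 1070 residues given for the two sequences.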



1. ENZYME PURIFICATION AND CHARACTERIZATION OF THE α-1,4-
GLUCAN LYASE FROM THE FUNGUS MORCHELLA COSTATA

1.1 Materials and Methods 

The fungus Morchella costata was obtained from American Type Culture Collection 
(ATCC). The fungus was grown at 25°C on a shaker using the culture medium 
recommended by ATCC. The mycelia were harvested by filtration and washed with 
0.9% NaCl. 

The fungal cells were broken by homogenization followed by sonication on ice for
6x3 min in 50 mM citrate-NaOH pH 6.2 (Buffer A). Cell debris was removed by
centrifugation at 25,000xg for 40 min. The supernatant obtained by this procedure
was regarded as cell-free extract and was used for activity staining and Western
blotting after separation on 8-25% gradient gels.

1.2 Separation by β-cyclodextrin Sepharose gel

The cell-free extract was applied directly to a β-cyclodextrin Sepharose gel 4B
column (2.6 x 18 cm) pre-equilibrated with Buffer A. The column was washed
with 3 volumes of Buffer A and 2 volumes of Buffer A containing 1 M NaCl. α-1,4-
glucan lyase was eluted with 2% dextrins in Buffer A. Active fractions were pooled
and the buffer changed to 20 mM Bis-tris propane-HCl (pH 7.0, Buffer B).

Active fractions were applied onto a Mono Q HR 5/5 column pre-equilibrated with
Buffer B. The fungal lyase was eluted with Buffer B in a linear gradient of 0.3 M
NaCl.

The lyase preparation obtained after β-cyclodextrin Sepharose chromatography was
alternatively concentrated to 150 µl and applied on a Superose 12 column operated
under FPLC conditions.

1.3 Assay for α-1,4-glucan lyase activity and conditions for determination of
substrate specificity, pH and temperature optimum

The reaction mixture for the assay of the α-1,4-glucan lyase activity contained 10 mg
ml⁻¹ amylopectin and 25 mM Mes-NaOH (pH 6.0).

The reaction was carried out at 30 °C for 30 min and stopped by the addition of 3,5- 
dinitrosalicylic acid reagent. Optical density at 550nm was measured after standing 
at room temperature for 10 min. 10 mM EDTA was added to the assay mixture 
when cell-free extracts were used. 

The substrate amylopectin in the assay mixture may be replaced with other substrates 
and the reaction temperature may vary as specified in the text. 

In the pH optimum investigations, the reaction mixture contained amylopectin or
maltotetraose 10 mg ml⁻¹ in a 40 mM buffer. The buffers used were glycine-NaOH
(pH 2.0-3.5), HOAc-NaOAc (pH 3.5-5.5), Mes-NaOH (pH 5.5-6.7), Mops-NaOH
(6.0-8.0) and bicine-NaOH (7.6-9.0). The reactions were carried out at 30°C for 30
min. The reaction conditions in the temperature optimum investigations were the same
as above except that the buffer Mops-NaOH (pH 6.0) was used in all experiments.
The reaction temperature was varied as indicated in the text.

SDS-PAGE, Native-PAGE and isoelectric focusing were performed on PhastSystem
(Pharmacia, Sweden) using 8-25% gradient gels and gels with a pH gradient of 3-9,
respectively. Following electrophoresis, the gels were stained by silver staining
according to the procedures recommended by the manufacturer (Pharmacia). The
glycoproteins were stained by PAS adapted to the PhastSystem. For activity staining,
the electrophoresis was performed under native conditions at 6°C.

Following the electrophoresis, the gel was incubated in the presence of 1% soluble
starch at 30°C overnight. The activity band of the fungal lyase was revealed by staining
with I2/KI solution.

1.4 Results 

1.4.1 Purification, molecular mass and isoelectric point of the α-1,4-glucan lyase

The fungal lyase was found to adsorb on columns packed with β-cyclodextrin
Sepharose, starches and Red Sepharose. Columns packed with β-cyclodextrin
Sepharose 4B gel and starches were used for purification purposes.

The lyase preparation obtained by this step contained only minor contaminating
proteins having a molecular mass higher than the fungal lyase. The impurity was
removed either by ion exchange chromatography on Mono Q HR 5/5 or, more
efficiently, by gel filtration on Superose 12.

The purified enzyme appeared colourless and showed no absorbance in the visible
light region. The molecular mass was determined to be 110 kDa as estimated on SDS-
PAGE.

The purified fungal lyase showed an isoelectric point (pI) of 5.4 determined by
isoelectric focusing on gels with a pH gradient of 3 to 9. In the native
electrophoresis gels, the enzyme appeared as one single band. This band showed
starch-degrading activity as detected by activity staining. Depending on the age of the
culture from which the enzyme is extracted, the enzyme on the native and isoelectric
focusing gels showed either as a sharp band or a more diffused band with the same
migration rate and pI.

1.4.2 The pH and temperature optimum of the fungal lyase catalyzed reaction

The optimum pH range for the fungal lyase catalyzed reaction was found to be
between pH 5 and pH 7.

1.4.3 Substrate specificity 

The purified fungal lyase degraded maltosaccharides from maltose to maltoheptaose. 
However, the degradation rates varied. The highest activity achieved was with 
maltotetraose (activity as 100%), followed by maltohexaose (97%), maltoheptaose 
(76%), maltotriose (56%) and the lowest activity was observed with maltose (2%). 

Amylopectin, amylose and glycogen were also degraded by the fungal lyase (% will
be determined). The fungal lyase was an exo-lyase, not an endo-lyase, as it degraded
p-nitrophenyl α-D-maltoheptaose but failed to degrade reducing end blocked p-
nitrophenyl α-D-maltoheptaose.

1.5 Morchella vulgaris

The protocols for the enzyme purification and characterisation of α-1,4-glucan lyase
obtained from Morchella vulgaris were the same as those above for Morchella
costata (with similar results - see results mentioned above).

2. AMINO ACID SEQUENCING OF THE α-1,4-GLUCAN LYASE FROM
FUNGUS



2.1 Amino acid sequencing of the lyases

The lyases were digested with either endoproteinase Arg-C from Clostridium
histolyticum or endoproteinase Lys-C from Lysobacter enzymogenes, both sequencing
grade purchased from Boehringer Mannheim, Germany. For digestion with
endoproteinase Arg-C, freeze-dried lyase (0.1 mg) was dissolved in 50 µl of 10 M urea,
50 mM methylamine, 0.1 M Tris-HCl, pH 7.6. After overlay with N2 and addition
of 10 µl of 50 mM DTT and 5 mM EDTA the protein was denatured and reduced for
10 min at 50°C under N2. Subsequently, 1 µg of endoproteinase Arg-C in 10 µl of 50
mM Tris-HCl, pH 8.0 was added, N2 was overlaid and the digestion was carried out
for 6 h at 37°C.

For subsequent cysteine derivatization, 12.5 µl of 100 mM iodoacetamide was added and
the solution was incubated for 15 min at RT in the dark under N2.

For digestion with endoproteinase Lys-C, freeze-dried lyase (0.1 mg) was dissolved
in 50 µl of 8 M urea, 0.4 M NH4HCO3, pH 8.4. After overlay with N2 and addition
of 5 µl of 45 mM DTT, the protein was denatured and reduced for 15 min at 50°C
under N2. After cooling to RT, 5 µl of 100 mM iodoacetamide was added for the
cysteines to be derivatized for 15 min at RT in the dark under N2. Subsequently, 90
µl of water and 5 µg of endoproteinase Lys-C in 50 µl of 50 mM tricine and 10 mM
EDTA, pH 8.0, were added and the digestion was carried out for 24 h at 37°C under
N2.



The resulting peptides were separated by reversed phase HPLC on a VYDAC C18
column (0.46 x 15 cm; 10 µm; The Separations Group; California) using solvent A:
0.1% TFA in water and solvent B: 0.1% TFA in acetonitrile. Selected peptides were
rechromatographed on a Develosil C18 column (0.46 x 10 cm; 3 µm; Dr. Ole Schou,
Novo Nordisk, Denmark) using the same solvent system prior to sequencing on an
Applied Biosystems 476A sequencer using pulsed-liquid fast cycles.

The amino acid sequence information from the enzyme derived from the fungus
Morchella costata is shown in Fig. 7.

The amino acid sequence information from the enzyme derived from the fungus
Morchella vulgaris is shown in Fig. 8.

3. DNA SEQUENCING OF GENES CODING FOR THE α-1,4-GLUCAN
LYASE FROM FUNGUS

3.1 METHODS FOR MOLECULAR BIOLOGY 

DNA was isolated as described by Dellaporta et al (1983 - Plant Mol Biol Rep vol
1 pp 19-21).

3.2 PCR 

The preparation of the relevant DNA molecule was done by use of the Gene Amp
DNA Amplification Kit (Perkin Elmer Cetus, USA) and in accordance with the
manufacturer's instructions except that the Taq polymerase was added later (see PCR
cycles) and the temperature cycling was changed to the following:

PCR cycles:

no. of cycles    °C    time (min.)
1                98    5
                 60    5
addition of Taq polymerase and oil
35               94    1

3.3 CLONING OF PCR FRAGMENTS

PCR fragments were cloned into pT7Blue (from Novagen) following the instructions 
of the supplier. 

3.4 DNA SEQUENCING 

Double stranded DNA was sequenced essentially according to the dideoxy method of
Sanger et al. (1977) using the Auto Read Sequencing Kit (Pharmacia) and the
Pharmacia LKB A.L.F. DNA sequencer. (Ref: Sanger, F., Nicklen, S. and Coulson,
A.R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad.
Sci. USA 74: 5463-5467.)

3.5 SCREENING OF THE LIBRARIES 

Screening of the Lambda Zap libraries obtained from Stratagene was performed in
accordance with the manufacturer's instructions except that the prehybridization and
hybridization were performed in 2xSSC, 0.1% SDS, 10x Denhardt's and 100 µg/ml
denatured salmon sperm DNA.

To the hybridization solution a 32P-labeled denatured probe was added. Hybridization
was performed overnight at 55°C. The filters were washed twice in 2xSSC, 0.1%
SDS and twice in 1xSSC, 0.1% SDS.



3.6 PROBE 



The cloned PCR fragments were isolated from the pT7Blue vector by digestion with
appropriate restriction enzymes. The fragments were separated from the vector by
agarose gel electrophoresis and the fragments were purified from the agarose by
Agarase (Boehringer Mannheim). As the fragments were only 90-240 bp long the
isolated fragments were exposed to a ligation reaction before labelling with 32P-dCTP
using either Prime-It random primer kit (Stratagene) or Ready to Go DNA labelling
kit (Pharmacia).

3.7 RESULTS

3.7.1 Generation of PCR DNA fragments coding for α-1,4-glucan lyase.

The amino acid sequences (shown below) of three overlapping tryptic peptides from
α-1,4-glucan lyase were used to generate mixed oligonucleotides, which could be
used as PCR primers for amplification of DNA isolated from both MC and MV.

Lys Asn Leu His Pro Gln His Lys Met Leu Lys Asp Thr Val Leu Asp Ile Val Lys
Pro Gly His Gly Glu Tyr Val Gly Trp Gly Glu Met Gly Gly Ile Gln Phe Met Lys
Glu Pro Thr Phe Met Asn Tyr Phe Asn Phe Asp Asn Met Gln Tyr Gln Gln Val Tyr
Ala Gln Gly Ala Leu Asp Ser Arg Glu Pro Leu Tyr His Ser Asp Pro Phe Tyr

In the first PCR amplification primers A1/A2 (see below) were used as upstream
primers and primers B1/B2 (see below) were used as downstream primers.

Primer A1: CA(GA)CA(CT)AA(GA)ATGCT(GATC)AA(GA)GA(CT)AC
Primer A2: CA(GA)CA(CT)AA(GA)ATGTT(GA)AA(GA)GA(CT)AC
Primer B1: TA(GA)AA(GATC)GG(GA)TC(GA)CT(GA)TG(GA)TA
Primer B2: TA(GA)AA(GATC)GG(GA)TC(GATC)GA(GA)TG(GA)TA
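
For illustration, the degeneracy of each mixed oligonucleotide (the number of distinct sequences present in the mixture) follows directly from the bracketed alternatives. The sketch below parses the bracket notation used above; the function names are hypothetical and the calculation is an illustrative addition, not part of the original protocol.

```python
import re

PRIMERS = {
    "A1": "CA(GA)CA(CT)AA(GA)ATGCT(GATC)AA(GA)GA(CT)AC",
    "A2": "CA(GA)CA(CT)AA(GA)ATGTT(GA)AA(GA)GA(CT)AC",
    "B1": "TA(GA)AA(GATC)GG(GA)TC(GA)CT(GA)TG(GA)TA",
    "B2": "TA(GA)AA(GATC)GG(GA)TC(GATC)GA(GA)TG(GA)TA",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a primer with (XY) alternatives."""
    n = 1
    for group in re.findall(r"\(([ACGT]+)\)", primer):
        n *= len(group)
    return n


def primer_length(primer):
    """Length in nucleotides, counting each bracketed group as one position."""
    return len(re.sub(r"\([ACGT]+\)", "N", primer))


for name, seq in PRIMERS.items():
    print(name, "length:", primer_length(seq), "degeneracy:", degeneracy(seq))
```

Splitting the leucine and serine codons between two primers (A1/A2 and B1/B2) keeps the degeneracy of each individual mixture lower than a single fully degenerate primer would have.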

The PCR products were analysed on a 2% LMT agarose gel and fragments of the
expected sizes were cut out from the gel and treated with Agarase (Boehringer
Mannheim) and cloned into the pT7Blue vector (Novagen) and sequenced.

The cloned fragments from the PCR amplification coded for amino acids
corresponding to the sequenced peptides (see above) and, in each case, additionally
contained two intron sequences. For MC the PCR amplified DNA sequence corresponds to the
sequence shown as from position 1202 to position 1522 with reference to Figure 4.
For MV the PCR amplified DNA sequence corresponds to the sequence shown as
from position 1218 to position 1535 with reference to Figure 5.

3.7.2 Screening of the genomic libraries with the cloned PCR fragments. 

Screening of the libraries with the above-mentioned clone gave two clones for each 
source. For MC the two clones were combined to form the sequence shown in 
Figure 4 (see below). For MV the two clones could be combined to form the 
sequence shown in Figure 5 in the manner described above. 

An additional PCR was performed to supplement the MC clone with PstI, PvuII, AscI
and NcoI restriction sites immediately in front of the ATG start codon using the
following oligonucleotide as an upstream primer:

AAACTGCAGCTGGCGCGCCATGGCAGGATTTTCTGAT

and a primer containing the complement sequence of bp 1297-1318 in Figure 4 was
used as a downstream primer.
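
As a minimal sketch, the presence of the four restriction sites in the upstream primer can be confirmed by a plain substring search. The recognition sequences used are the standard ones for PstI, PvuII, AscI and NcoI. The primer is read here with G at position 22, as the NcoI site CCATGG requires; the function name is hypothetical.

```python
# Standard recognition sequences for the four enzymes named in the text.
SITES = {
    "PstI": "CTGCAG",
    "PvuII": "CAGCTG",
    "AscI": "GGCGCGCC",
    "NcoI": "CCATGG",
}

# Upstream primer, read with G at position 22 (required by the NcoI site).
PRIMER = "AAACTGCAGCTGGCGCGCCATGGCAGGATTTTCTGAT"


def find_sites(seq, sites):
    """Return the 0-based start index of each site in seq, or -1 if absent."""
    return {name: seq.find(site) for name, site in sites.items()}


positions = find_sites(PRIMER, SITES)
atg_index = PRIMER.find("ATG")
print(positions, "ATG at", atg_index)
```

The sites overlap one another in the primer, and the ATG start codon lies within the NcoI site.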

The complete sequence for MC was generated by cloning the 5' end of the gene as
a BglII-EcoRI fragment from one of the genomic clones (first clone) into the BamHI-
EcoRI sites of pBluescript II KS+ vector from Stratagene. The 3' end of the gene
was then cloned into the modified pBluescript II KS+ vector by ligating an NspV
(blunt ended, using the DNA blunting kit from Amersham International)-EcoRI
fragment from the other genomic clone (second clone) after the modified pBluescript
II KS+ vector had been digested with EcoRI and EcoRV. Then the intermediate part
of the gene was cloned into the further modified pBluescript II KS+ vector as an
EcoRI fragment from the first clone by ligating that fragment into the further
modified pBluescript II KS+ vector digested with EcoRI.

4. EXPRESSION OF THE GL GENE IN MICRO-ORGANISMS

The DNA sequence encoding the GL can be introduced into microorganisms to 
produce the enzyme with high specific activity and in large quantities. 



WO 95/10617 



PCT/EP94/03398 




In this regard, the MC gene (Figure 4) was cloned as a XbaI-XhoI blunt ended (using
the DNA blunting kit from Amersham International) fragment into the Pichia 
expression vector pHIL-D2 (containing the AOX1 promoter) digested with EcoRI and 
blunt ended (using the DNA blunting kit from Amersham International) for expression 
in Pichia pastoris (according to the protocol stated in the Pichia Expression Kit 
supplied by Invitrogen). 

In another embodiment, the MC gene 1 (same as Figure 4 except that it was modified
by PCR to introduce restriction sites as described above) was cloned as a PvuII-XhoI
blunt ended fragment (using the DNA blunting kit from Amersham International) into
the Aspergillus expression vector pBARMTE1 (containing the methyl tryptophan
resistance promoter from Neurospora crassa) digested with SmaI for expression in
Aspergillus niger (Pall et al (1993) Fungal Genet Newslett. vol 40 pages 59-62). The
protoplasts were prepared according to Daboussi et al (Curr Genet (1989) vol 15 pp 
453-456) using lysing enzymes Sigma L-2773 and the lyticase Sigma L-8012. The 
transformation of the protoplasts was followed according to the protocol stated by 
Buxton et al (Gene (1985) vol 37 pp 207-214) except that for plating the transformed 
protoplasts the protocol laid out in Punt et al (Methods in Enzymology (1992) vol 216 
pp 447 - 457) was followed but with the use of 0.6% osmotic stabilised top agarose. 

The results showed that lyase activity was observed in the transformed Pichia pastoris 
and Aspergillus niger. These experiments are now described. 

ANALYSES OF PICHIA LYASE TRANSFORMANTS AND ASPERGILLUS 
LYASE TRANSFORMANTS 

GENERAL METHODS 

Preparation of cell-free extracts. 

The cells were harvested by centrifugation at 9000 rpm for 5 min and washed with
0.9% NaCl and resuspended in the breaking buffer (50 mM K-phosphate, pH 7.5
containing 1 mM EDTA and 5% glycerol). Cells were broken using glass beads
and vortex treatment. The breaking buffer contained 1 mM PMSF (protease
inhibitor). The lyase extract (supernatant) was obtained after centrifugation at 9000 rpm for
5 min followed by centrifugation at 20,000xg for 5 min.

Assay of lyase activity by alkaline 3,5-dinitrosalicylic acid reagent (DNS) 

One volume of lyase extract was mixed with an equal volume of 4% amylopectin
solution. The reaction mixture was then incubated at a controlled temperature and
samples were removed at specified intervals and analyzed for AF.

The lyase activity was also analyzed using a radioactive method. 

The reaction mixture contained 10 µl of 14C-starch solution (1 µCi; Sigma Chemicals
Co.) and 10 µl of the lyase extract. The reaction mixture was left at 25°C overnight
and was then analyzed in the usual TLC system. The radioactive AF produced was
detected using an Instant Imager (Packard Instrument Co., Inc., Meriden, CT).

Electrophoresis and Western blotting 

SDS-PAGE was performed using 8-25% gradient gels and the PhastSystem
(Pharmacia). Western blotting was also run on a Semidry transfer unit of the
PhastSystem. Primary antibodies raised against the lyase purified from the red
seaweed collected at Qingdao (China) were used in a dilution of 1:100. Pig antirabbit
IgG conjugated to alkaline phosphatase (Dako A/S, Glostrup, Denmark) was used
as secondary antibody in a dilution of 1:1000.

Part I, Analysis of the Pichia transformants containing the above-mentioned
construct

MC-Lyase expressed intracellularly in Pichia pastoris

Names of culture    Specific activity*
A18                 10
A20                 32
A21                 8
A22                 8
A24                 6

*The specific activity was defined as nmol of AF produced per min per mg protein
at 25°C.
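
By way of example only, the definition of specific activity given above (nmol of AF produced per min per mg protein) corresponds to the following calculation. The input values in the usage line are hypothetical and are chosen merely to show the arithmetic; they are not measurements from the experiments described.

```python
def specific_activity(nmol_af, minutes, mg_protein):
    """Specific activity in nmol AF per min per mg protein."""
    if minutes <= 0 or mg_protein <= 0:
        raise ValueError("minutes and mg_protein must be positive")
    return nmol_af / (minutes * mg_protein)


# HYPOTHETICAL example: 480 nmol AF formed in 30 min by 0.5 mg of protein.
example = specific_activity(nmol_af=480, minutes=30, mg_protein=0.5)
print(example, "nmol AF per min per mg protein")
```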

Part II, The Aspergillus transformants

Results

I. Lyase activity was determined after 5 days of incubation (minimal medium
containing 0.2% casein enzymatic hydrolysate); analysis by the alkaline
3,5-dinitrosalicylic acid reagent



Lyase activity analysis in cell-free extracts

Name of the culture   Specific activity*
8.13                  11
8.16                  538
8.19                  37

*The specific activity was defined as nmol of AF produced per min per mg protein
at 25°C.

The results show that the MC-lyase was expressed intracellularly in A. niger.

Instead of Aspergillus niger as host, other industrially important microorganisms for
which good expression systems are known could be used, such as: Aspergillus oryzae,
Aspergillus sp., Trichoderma sp., Saccharomyces cerevisiae, Kluyveromyces sp.,
Hansenula sp., Pichia sp., Bacillus subtilis, B. amyloliquefaciens, Bacillus sp.,
Streptomyces sp. or E. coli.

Other preferred embodiments of the present invention include any one of the 
following: A transformed host organism having the capability of producing AF as 
a consequence of the introduction of a DNA sequence as herein described; such a 
transformed host organism which is a microorganism - preferably wherein the host 
organism is selected from the group consisting of bacteria, moulds, fungi and yeast; 
preferably the host organism is selected from the group consisting of Saccharomyces,
Kluyveromyces, Aspergillus, Trichoderma, Hansenula, Pichia, Bacillus, Streptomyces
and Escherichia, such as Aspergillus oryzae, Saccharomyces cerevisiae, Bacillus subtilis,
Bacillus amyloliquefaciens and Escherichia coli; A method for preparing the sugar
1,5-D-anhydrofructose comprising contacting an alpha-1,4-glucan (e.g. starch) with the
enzyme α-1,4-glucan lyase expressed by a transformed host organism comprising a
nucleotide sequence encoding the same, preferably wherein the nucleotide sequence
is a DNA sequence, preferably wherein the DNA sequence is one of the sequences
hereinbefore described; A vector incorporating a nucleotide sequence as hereinbefore
described, preferably wherein the vector is a replication vector, preferably wherein
the vector is an expression vector containing the nucleotide sequence downstream
from a promoter sequence, preferably the vector contains a marker (such as a
resistance marker); Cellular organisms, or cell line, transformed with such a vector;
A method of producing the product α-1,4-glucan lyase or any nucleotide sequence or
part thereof coding for same, which comprises culturing such an organism (or cells
from a cell line) transfected with such a vector and recovering the product.

Other modifications of the present invention will be apparent to those skilled in the 
art without departing from the scope of the invention. 
SEQUENCE LISTING

(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: Danisco A/S
(B) STREET: Langebrogade 1
(C) CITY: Copenhagen
(D) STATE: Copenhagen K
(E) COUNTRY: Denmark
(F) POSTAL CODE (ZIP): DK-1001
(ii) TITLE OF INVENTION: ENZYME
(iii) NUMBER OF SEQUENCES: 10
(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(v, Ai! , Si A K; t,e ' S! Version < EP0 > 

APPLICATION NUMBER: WO PCT/EP94/03398 

(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1066 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Ala Gly Phe Ser Asp Pro Leu Asn Phe Cys Lys Ala Glu Asp Tyr
1 5 10 15

Tyr Ser Val Ala Leu Asp Trp Lys Gly Pro Gln Lys Ile Ile Gly Val
20 25 30

Asp Thr Thr Pro Pro Lys Ser Thr Lys Phe Pro Lys Asn Trp His Gly
35 40 45

Val Asn Leu Arg Phe Asp Asp Gly Thr Leu Gly Val Val Gln Phe Ile
50 55 60

Arg Pro Cys Val Trp Arg Val Arg Tyr Asp Pro Gly Phe Lys Thr Ser
65 70 75 80

Asp Glu Tyr Gly Asp Glu Asn Thr Arg Thr Ile Val Gln Asp Tyr Met
85 90 95

Ser Thr Leu Ser Asn Lys Leu Asp Thr Tyr Arg Gly Leu Thr Trp Glu
100 105 110

Thr Lys Cys Glu Asp Ser Gly Asp Phe Phe Thr Phe Ser Ser Lys Val
115 120 125

Thr Ala Val Glu Lys Ser Glu Arg Thr Arg Asn Lys Val Gly Asp Gly
130 135 140

Leu Arg Ile His Leu Trp Lys Ser Pro Phe Arg Ile Gln Val Val Arg
145 150 155 160

Thr Leu Thr Pro Leu Lys Asp Pro Tyr Pro Ile Pro Asn Val Ala Ala
165 170 175

Ala Glu Ala Arg Val Ser Asp Lys Val Val Trp Gln Thr Ser Pro Lys
180 185 190

Thr Phe Arg Lys Asn Leu His Pro Gln His Lys Met Leu Lys Asp Thr
195 200 205

Val Leu Asp Ile Val Lys Pro Gly His Gly Glu Tyr Val Gly Trp Gly
210 215 220

Glu Met Gly Gly Ile Gln Phe Met Lys Glu Pro Thr Phe Met Asn Tyr
225 230 235 240

Phe Asn Phe Asp Asn Met Gln Tyr Gln Gln Val Tyr Ala Gln Gly Ala
245 250 255

Leu Asp Ser Arg Glu Pro Leu Tyr His Ser Asp Pro Phe Tyr Leu Asp
260 265 270

Val Asn Ser Asn Pro Glu His Lys Asn Ile Thr Ala Thr Phe Ile Asp
275 280 285

Asn Tyr Ser Gln Ile Ala Ile Asp Phe Gly Lys Thr Asn Ser Gly Tyr
290 295 300

Ile Lys Leu Gly Thr Arg Tyr Gly Gly Ile Asp Cys Tyr Gly Ile Ser
305 310 315 320

Ala Asp Thr Val Pro Glu Ile Val Arg Leu Tyr Thr Gly Leu Val Gly
325 330 335

Arg Ser Lys Leu Lys Pro Arg Tyr Ile Leu Gly Ala His Gln Ala Cys
340 345 350

Tyr Gly Tyr Gln Gln Glu Ser Asp Leu Tyr Ser Val Val Gln Gln Tyr
355 360 365

Arg Asp Cys Lys Phe Pro Leu Asp Gly Ile His Val Asp Val Asp Val
370 375 380

Gln Asp Gly Phe Arg Thr Phe Thr Thr Asn Pro His Thr Phe Pro Asn
385 390 395 400

Pro Lys Glu Met Phe Thr Asn Leu Arg Asn Asn Gly Ile Lys Cys Ser
405 410 415

Thr Asn Ile Thr Pro Val Ile Ser Ile Asn Asn Arg Glu Gly Gly Tyr
420 425 430

Ser Thr Leu Leu Glu Gly Val Asp Lys Lys Tyr Phe Ile Met Asp Asp
435 440 445

Arg Tyr Thr Glu Gly Thr Ser Gly Asn Ala Lys Asp Val Arg Tyr Met
450 455 460

Tyr Tyr Gly Gly Gly Asn Lys Val Glu Val Asp Pro Asn Asp Val Asn
465 470 475 480

Gly Arg Pro Asp Phe Lys Asp Asn Tyr Asp Phe Pro Ala Asn Phe Asn
485 490 495

Ser Lys Gln Tyr Pro Tyr His Gly Gly Val Ser Tyr Gly Tyr Gly Asn
500 505 510

Gly Ser Ala Gly Phe Tyr Pro Asp Leu Asn Arg Lys Glu Val Arg Ile
515 520 525

Trp Trp Gly Met Gln Tyr Lys Tyr Leu Phe Asp Met Gly Leu Glu Phe
530 535 540

Val Trp Gln Asp Met Thr Thr Pro Ala Ile His Thr Ser Tyr Gly Asp
545 550 555 560

Met Lys Gly Leu Pro Thr Arg Leu Leu Val Thr Ser Asp Ser Val Thr
565 570 575

Asn Ala Ser Glu Lys Lys Leu Ala Ile Glu Thr Trp Ala Leu Tyr Ser
580 585 590

Tyr Asn Leu His Lys Ala Thr Trp His Gly Leu Ser Arg Leu Glu Ser
595 600 605

Arg Lys Asn Lys Arg Asn Phe Ile Leu Gly Arg Gly Ser Tyr Ala Gly
610 615 620

Ala Tyr Arg Phe Ala Gly Leu Trp Thr Gly Asp Asn Ala Ser Asn Trp
625 630 635 640

Glu Phe Trp Lys Ile Ser Val Ser Gln Val Leu Ser Leu Gly Leu Asn
645 650 655

Gly Val Cys Ile Ala Gly Ser Asp Thr Gly Gly Phe Glu Pro Tyr Arg
660 665 670

Asp Ala Asn Gly Val Glu Glu Lys Tyr Cys Ser Pro Glu Leu Leu Ile
675 680 685

Arg Trp Tyr Thr Gly Ser Phe Leu Leu Pro Trp Leu Arg Asn His Tyr
690 695 700

Val Lys Lys Asp Arg Lys Trp Phe Gln Glu Pro Tyr Ser Tyr Pro Lys
705 710 715 720

His Leu Glu Thr His Pro Glu Leu Ala Asp Gln Ala Trp Leu Tyr Lys
725 730 735

Ser Val Leu Glu Ile Cys Arg Tyr Tyr Val Glu Leu Arg Tyr Ser Leu
740 745 750

Ile Gln Leu Leu Tyr Asp Cys Met Phe Gln Asn Val Val Asp Gly Met
755 760 765

Pro Ile Thr Arg Ser Met Leu Leu Thr Asp Thr Glu Asp Thr Thr Phe
770 775 780

Phe Asn Glu Ser Gln Lys Phe Leu Asp Asn Gln Tyr Met Ala Gly Asp
785 790 795 800

Asp Ile Leu Val Ala Pro Ile Leu His Ser Arg Lys Glu Ile Pro Gly
805 810 815

Glu Asn Arg Asp Val Tyr Leu Pro Leu Tyr His Thr Trp Tyr Pro Ser
820 825 830

Asn Leu Arg Pro Trp Asp Asp Gln Gly Val Ala Leu Gly Asn Pro Val
835 840 845

Glu Gly Gly Ser Val Ile Asn Tyr Thr Ala Arg Ile Val Ala Pro Glu
850 855 860

Asp Tyr Asn Leu Phe His Ser Val Val Pro Val Tyr Val Arg Glu Gly
865 870 875 880

Ala Ile Ile Pro Gln Ile Glu Val Arg Gln Trp Thr Gly Gln Gly Gly
885 890 895

Ala Asn Arg Ile Lys Phe Asn Ile Tyr Pro Gly Lys Asp Lys Glu Tyr
900 905 910

Cys Thr Tyr Leu Asp Asp Gly Val Ser Arg Asp Ser Ala Pro Glu Asp
915 920 925

Leu Pro Gln Tyr Lys Glu Thr His Glu Gln Ser Lys Val Glu Gly Ala
930 935 940

Glu Ile Ala Lys Gln Ile Gly Lys Lys Thr Gly Tyr Asn Ile Ser Gly
945 950 955 960

Thr Asp Pro Glu Ala Lys Gly Tyr His Arg Lys Val Ala Val Thr Gln
965 970 975

Thr Ser Lys Asp Lys Thr Arg Thr Val Thr Ile Glu Pro Lys His Asn
980 985 990

Gly Tyr Asp Pro Ser Lys Glu Val Gly Asp Tyr Tyr Thr Ile Ile Leu
995 1000 1005

Trp Tyr Ala Pro Gly Phe Asp Gly Ser Ile Val Asp Val Ser Lys Thr
1010 1015 1020

Thr Val Asn Val Glu Gly Gly Val Glu His Gln Val Tyr Lys Asn Ser
1025 1030 1035 1040

Asp Leu His Thr Val Val Ile Asp Val Lys Glu Val Ile Gly Thr Thr
1045 1050 1055

Lys Ser Val Lys Ile Thr Cys Thr Ala Ala
1060 1065

(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1070 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Gly Leu Ser Asp Pro Leu Asn Phe Cys Lys Ala Glu Asp Tyr
1 5 10 15

Tyr Ala Ala Ala Lys Gly Trp Ser Gly Pro Gln Lys Ile Ile Arg Tyr
20 25 30

Asp Gln Thr Pro Pro Gln Gly Thr Lys Asp Pro Lys Ser Trp His Ala
35 40 45

Val Asn Leu Pro Phe Asp Asp Gly Thr Met Cys Val Val Gln Phe Val
50 55 60

Arg Pro Cys Val Trp Arg Val Arg Tyr Asp Pro Ser Val Lys Thr Ser
65 70 75 80

Asp Glu Tyr Gly Asp Glu Asn Thr Arg Thr Ile Val Gln Asp Tyr Met
85 90 95

Thr Thr Leu Val Gly Asn Leu Asp Ile Phe Arg Gly Leu Thr Trp Val
100 105 110

Ser Thr Leu Glu Asp Ser Gly Glu Tyr Tyr Thr Phe Lys Ser Glu Val
115 120 125

Thr Ala Val Asp Glu Thr Glu Arg Thr Arg Asn Lys Val Gly Asp Gly
130 135 140

Leu Lys Ile Tyr Leu Trp Lys Asn Pro Phe Arg Ile Gln Val Val Arg
145 150 155 160

Leu Leu Thr Pro Leu Val Asp Pro Phe Pro Ile Pro Asn Val Ala Asn
165 170 175

Ala Thr Ala Arg Val Ala Asp Lys Val Val Trp Gln Thr Ser Pro Lys
180 185 190

Thr Phe Arg Lys Asn Leu His Pro Gln His Lys Met Leu Lys Asp Thr
195 200 205

Val Leu Asp Ile Ile Lys Pro Gly His Gly Glu Tyr Val Gly Trp Gly
210 215 220

Glu Met Gly Gly Ile Glu Phe Met Lys Glu Pro Thr Phe Met Asn Tyr
225 230 235 240

Phe Asn Phe Asp Asn Met Gln Tyr Gln Gln Val Tyr Ala Gln Gly Ala
245 250 255

Leu Asp Ser Arg Glu Pro Leu Tyr His Ser Asp Pro Phe Tyr Leu Asp
260 265 270

Val Asn Ser Asn Pro Glu His Lys Asn Ile Thr Ala Thr Phe Ile Asp
275 280 285

Asn Tyr Ser Gln Ile Ala Ile Asp Phe Gly Lys Thr Asn Ser Gly Tyr
290 295 300

Ile Lys Leu Gly Thr Arg Tyr Gly Gly Ile Asp Cys Tyr Gly Ile Ser
305 310 315 320

Ala Asp Thr Val Pro Glu Ile Val Arg Leu Tyr Thr Gly Leu Val Gly
325 330 335

Arg Ser Lys Leu Lys Pro Arg Tyr Ile Leu Gly Ala His Gln Ala Cys
340 345 350

Tyr Gly Tyr Gln Gln Glu Ser Asp Leu His Ala Val Val Gln Gln Tyr
355 360 365

Arg Asp Thr Lys Phe Pro Leu Asp Gly Leu His Val Asp Val Asp Phe
370 375 380

Gln Asp Asn Phe Arg Thr Phe Thr Thr Asn Pro Ile Thr Phe Pro Asn
385 390 395 400

Pro Lys Glu Met Phe Thr Asn Leu Arg Asn Asn Gly Ile Lys Cys Ser
405 410 415

Thr Asn Ile Thr Pro Val Ile Ser Ile Arg Asp Arg Pro Asn Gly Tyr
420 425 430

Ser Thr Leu Asn Glu Gly Tyr Asp Lys Lys Tyr Phe Ile Met Asp Asp
435 440 445

Arg Tyr Thr Glu Gly Thr Ser Gly Asp Pro Gln Asn Val Arg Tyr Ser
450 455 460

Phe Tyr Gly Gly Gly Asn Pro Val Glu Val Asn Pro Asn Asp Val Trp
465 470 475 480

Ala Arg Pro Asp Phe Gly Asp Asn Tyr Asp Phe Pro Thr Asn Phe Asn
485 490 495

Cys Lys Asp Tyr Pro Tyr His Gly Gly Val Ser Tyr Gly Tyr Gly Asn
500 505 510

Gly Thr Pro Gly Tyr Tyr Pro Asp Leu Asn Arg Glu Glu Val Arg Ile
515 520 525

Trp Trp Gly Leu Gln Tyr Glu Tyr Leu Phe Asn Met Gly Leu Glu Phe
530 535 540

Val Trp Gln Asp Met Thr Thr Pro Ala Ile His Ser Ser Tyr Gly Asp
545 550 555 560

Met Lys Gly Leu Pro Thr Arg Leu Leu Val Thr Ala Asp Ser Val Thr
565 570 575

Asn Ala Ser Glu Lys Lys Leu Ala Ile Glu Ser Trp Ala Leu Tyr Ser
580 585 590

Tyr Asn Leu His Lys Ala Thr Phe His Gly Leu Gly Arg Leu Glu Ser
595 600 605

Arg Lys Asn Lys Arg Asn Phe Ile Leu Gly Arg Gly Ser Tyr Ala Gly
610 615 620

Ala Tyr Arg Phe Ala Gly Leu Trp Thr Gly Asp Asn Ala Ser Thr Trp
625 630 635 640

Glu Phe Trp Lys Ile Ser Val Ser Gln Val Leu Ser Leu Gly Leu Asn
645 650 655

Gly Val Cys Ile Ala Gly Ser Asp Thr Gly Gly Phe Glu Pro Ala Arg
660 665 670

Thr Glu Ile Gly Glu Glu Lys Tyr Cys Ser Pro Glu Leu Leu Ile Arg
675 680 685

Trp Tyr Thr Gly Ser Phe Leu Leu Pro Trp Leu Arg Asn His Tyr Val
690 695 700

Lys Lys Asp Arg Lys Trp Phe Gln Glu Pro Tyr Ala Tyr Pro Lys His
705 710 715 720

Leu Glu Thr His Pro Glu Leu Ala Asp Gln Ala Trp Leu Tyr Lys Ser
725 730 735

Val Leu Glu Ile Cys Arg Tyr Trp Val Glu Leu Arg Tyr Ser Leu Ile
740 745 750

Gln Leu Leu Tyr Asp Cys Met Phe Gln Asn Val Val Asp Gly Met Pro
755 760 765

Leu Ala Arg Ser Met Leu Leu Thr Asp Thr Glu Asp Thr Thr Phe Phe
770 775 780

Asn Glu Ser Gln Lys Phe Leu Asp Asn Gln Tyr Met Ala Gly Asp Asp
785 790 795 800

Ile Leu Val Ala Pro Ile Leu His Ser Arg Asn Glu Val Pro Gly Glu
805 810 815

Asn Arg Asp Val Tyr Leu Pro Leu Phe His Thr Trp Tyr Pro Ser Asn
820 825 830

Leu Arg Pro Trp Asp Asp Gln Gly Val Ala Leu Gly Asn Pro Val Glu
835 840 845

Gly Gly Ser Val Ile Asn Tyr Thr Ala Arg Ile Val Ala Pro Glu Asp
850 855 860

Tyr Asn Leu Phe His Asn Val Val Pro Val Tyr Ile Arg Glu Gly Ala
865 870 875 880

Ile Ile Pro Gln Ile Gln Val Arg Gln Trp Ile Gly Glu Gly Gly Pro
885 890 895

Asn Pro Ile Lys Phe Asn Ile Tyr Pro Gly Lys Asp Lys Glu Tyr Val
900 905 910

Thr Tyr Leu Asp Asp Gly Val Ser Arg Asp Ser Ala Pro Asp Asp Leu
915 920 925

Pro Gln Tyr Arg Glu Ala Tyr Glu Gln Ala Lys Val Glu Gly Lys Asp
930 935 940

Val Gln Lys Gln Leu Ala Val Ile Gln Gly Asn Lys Thr Asn Asp Phe
945 950 955 960

Ser Ala Ser Gly Ile Asp Lys Glu Ala Lys Gly Tyr His Arg Lys Val
965 970 975

Ser Ile Lys Gln Glu Ser Lys Asp Lys Thr Arg Thr Val Thr Ile Glu
980 985 990

Pro Lys His Asn Gly Tyr Asp Pro Ser Lys Glu Val Gly Asn Tyr Tyr
995 1000 1005

Thr Ile Ile Leu Trp Tyr Ala Pro Gly Phe Asp Gly Ser Ile Val Asp
1010 1015 1020

Val Ser Gln Ala Thr Val Asn Ile Glu Gly Gly Val Glu Cys Glu Ile
1025 1030 1035 1040

Phe Lys Asn Thr Gly Leu His Thr Val Val Val Asn Val Lys Glu Val
1045 1050 1055

Ile Gly Thr Thr Lys Ser Val Lys Ile Thr Cys Thr Thr Ala
1060 1065 1070
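
As a rough illustration of how closely SEQ ID NO: 1 and SEQ ID NO: 2 resemble each other, the sketch below compares their first 32 residues position by position. It is only an ungapped comparison of the N-terminal stretch, not a full alignment of the two proteins, and the helper names are hypothetical.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Residues 1-32 of SEQ ID NO: 1 and SEQ ID NO: 2 as given in the listing.
seq1_n_term = to_one_letter(
    "Met Ala Gly Phe Ser Asp Pro Leu Asn Phe Cys Lys Ala Glu Asp Tyr "
    "Tyr Ser Val Ala Leu Asp Trp Lys Gly Pro Gln Lys Ile Ile Gly Val"
)
seq2_n_term = to_one_letter(
    "Met Ala Gly Leu Ser Asp Pro Leu Asn Phe Cys Lys Ala Glu Asp Tyr "
    "Tyr Ala Ala Ala Lys Gly Trp Ser Gly Pro Gln Lys Ile Ile Arg Tyr"
)

identity = percent_identity(seq1_n_term, seq2_n_term)
print(seq1_n_term, seq2_n_term, identity)
```

For a comparison over the whole length of the two sequences, which differ in length (1066 and 1070 residues), a gapped alignment tool would be needed instead of this position-by-position count.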

(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3201 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATGGCAGGAT TTTCTGATCC TCTCAACTTT TGCAAAGCAG AAGACTACTA CAGTGTTGCG 60 
CTAGACTGGA AGGGCCCTCA AAAAATCATT GGAGTAGACA CTACTCCTCC AAAGAGCACC 120 
AAGTTCCCCA AAAACTGGCA TGGAGTGAAC TTGAGATTCG ATGATGGGAC TTTAGGTGTG 180 
GTTCAGTTCA TTAGGCCGTG CGTTTGGAGG GTTAGATACG ACCCTGGTTT CAAGACCTCT 240

GACGAGTATG GTGATGAGAA TACGAGGACA ATTGTGCAAG ATTATATGAG TACTCTGAGT 300

AATAAATTGG ATACTTATAG AGGTCTTACG TGGGAAACCA AGTGTGAGGA TTCGGGAGAT 360 

TTCTTTACCT TCTCATCCAA GGTCACCGCC GTTGAAAAAT CCGAGCGGAC CCGCAACAAG 420 

GTCGGCGATG GCCTCAGAAT TCACCTATGG AAAAGCCCTT TCCGCATCCA AGTAGTGCGC 480 

ACCTTGACCC CTTTGAAGGA TCCTTACCCC ATTCCAAATG TAGCCGCAGC CGAAGCCCGT 540 

GTGTCCGACA AGGTCGTTTG GCAAACGTCT CCCAAGACAT TCAGAAAGAA CCTGCATCCG 600 

CAACACAAGA TGCTAAAGGA TACAGTTCTT GACATTGTCA AACCTGGACA TGGCGAGTAT 660 

GTGGGGTGGG GAGAGATGGG AGGTATCCAG TTTATGAAGG AGCCAACATT CATGAACTAT 720 

TTTAACTTCG ACAATATGCA ATACCAGCAA GTCTATGCCC AAGGTGCTCT CGATTCTCGC 780 

GAGCCACTGT ACCACTCGGA TCCCTTCTAT CTTGATGTGA ACTCCAACCC GGAGCACAAG 840 

AATATCACGG CAACCTTTAT CGATAACTAC TCTCAAATTG CCATCGACTT TGGAAAGACC 900 

AACTCAGGCT ACATCAAGCT GGGAACCAGG TATGGTGGTA TCGATTGTTA CGGTATCAGT 960 

GCGGATACGG TCCCGGAAAT TGTACGACTT TATACAGGTC TTGTTGGACG TTCAAAGTTG 1020 

AAGCCCAGAT ATATTCTCGG GGCCCATCAA GCCTGTTATG GATACCAACA GGAAAGTGAC 1080 

TTGTATTCTG TGGTCCAGCA GTACCGTGAC TGTAAATTTC CACTTGACGG GATTCACGTC 1140 

GATGTCGATG TTCAGGACGG CTTCAGAACT TTCACCACCA ACCCACACAC TTTCCCTAAC 1200 

CCCAAAGAGA TGTTTACTAA CTTGAGGAAT AATGGAATCA AGTGCTCCAC CAATATCACT 1260 

CCTGTTATCA GCATTAACAA CAGAGAGGGT GGATACAGTA CCCTCCTTGA GGGAGTTGAC 1320 

AAAAAATACT TTATCATGGA CGACAGATAT ACCGAGGGAA CAAGTGGGAA TGCGAAGGAT 1380 
GTTCGGTACA TGTACTACGG TGGTGGTAAT AAGGTTGAGG TCGATCCTAA TGATGTTAAT 1440

GGTCGGCCAG ACTTTAAAGA CAACTATGAC TTCCCCGCGA ACTTCAACAG CAAACAATAC 1500

CCCTATCATG GTGGTGTGAG CTACGGTTAT GGGAACGGTA GTGCAGGTTT TTACCCGGAC 1560 

CTCAACAGAA AGGAGGTTCG TATCTGGTGG GGAATGCAGT ACAAGTATCT CTTCGATATG 1620 

GGACTGGAAT TTGTGTGGCA AGACATGACT ACCCCAGCAA TCCACACATC ATATGGAGAC 1680 

ATGAAAGGGT TGCCCACCCG TCTACTCGTC ACCTCAGACT CCGTCACCAA TGCCTCTGAG 1740 

AAAAAGCTCG CAATTGAAAC TTGGGCTCTC TACTCCTACA ATCTCCACAA AGCAACTTGG 1800 

CATGGTCTTA GTCGTCTCGA ATCTCGTAAG AACAAACGAA ACTTCATCCT CGGGCGTGGA 1860 
AGTTATGCCG GAGCCTATCG TTTTGCTGGT CTCTGGACTG GGGATAATGC AAGTAACTGG 1920

GAATTCTGGA AGATATCGGT CTCTCAAGTT CTTTCTCTGG GCCTCAATGG TGTGTGCATC 1980 

GCGGGGTCTG ATACGGGTGG TTTTGAACCC TACCGTGATG CAAATGGGGT CGAGGAGAAA 2040 

TACTGTAGCC CAGAGCTACT CATCAGGTGG TATACTGGTT CATTCCTCTT GCCGTGGCTC 2100 

AGGAACCATT ATGTCAAAAA GGACAGGAAA TGGTTCCAGG AACCATACTC GTACCCCAAG 2160 

CATCTTGAAA CCCATCCAGA ACTCGCAGAC CAAGCATGGC TCTATAAATC CGTTTTGGAG 2220 

ATCTGTAGGT ACTATGTGGA GCTTAGATAC TCCCTCATCC AACTACTTTA CGACTGCATG 2280 

TTTCAAAACG TAGTCGACGG TATGCCAATC ACCAGATCTA TGCTCTTGAC CGATACTGAG 2340 

GATACCACCT TCTTCAACGA GAGCCAAAAG TTCCTCGACA ACCAATATAT GGCTGGTGAC 2400 

GACATTCTTG TTGCACCCAT CCTCCACAGT CGCAAAGAAA TTCCAGGCGA AAACAGAGAT 2460 

GTCTATCTCC CTCTTTACCA CACCTGGTAC CCCTCAAATT TGAGACCATG GGACGATCAA 2520 

GGAGTCGCTT TGGGGAATCC TGTCGAAGGT GGTAGTGTCA TCAATTATAC TGCTAGGATT 2580 

GTTGCACCCG AGGATTATAA TCTCTTCCAC AGCGTGGTAC CAGTCTACGT TAGAGAGGGT 2640 

GCCATCATCC CGCAAATCGA AGTACGCCAA TGGACTGGCC AGGGGGGAGC CAACGGCATC 2700 

AAGTTCAACA TCTACCCTGG AAAGGATAAG GAGTACTGTA CCTATCTTGA TGATGGTGTT 2760 

AGCCGTGATA GTGCGCCGGA AGACCTCCCA CAGTACAAAG AGACCCACGA ACAGTCGAAG 2820 

GTTGAAGGCG CGGAAATCGC AAAGCAGATT GGAAAGAAGA CGGGTTACAA CATCTCAGGA 2880 

ACCGACCCAG AAGCAAAGGG TTATCACCGC AAAGTTGCTG TCACACAAAC GTCAAAAGAC 2940 

AAGACGCGTA CTGTCACTAT TGAGCCAAAA CACAATGGAT ACGACCCTTC CAAAGAGGTG 3000 

GGTGATTATT ATACCATCAT TCTTTGGTAC GCACCAGGTT TCGATGGCAG CATCGTCGAT 3060 

GTGAGCAAGA CGACTGTGAA TGTTGAGGGT GGGGTGGAGC ACCAAGTTTA TAAGAACTCC 3120 

GATTTACATA CGGTTGTTAT CGACGTGAAG GAGGTGATCG GTACCACAAA GAGCGTCAAG 3180 
ATCACATGTA CTGCCGCTTA A 3201
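
The relationship between the nucleotide listing (SEQ ID NO: 3) and the amino acid listing (SEQ ID NO: 1) can be checked with a short translation sketch. The code below assumes the standard genetic code and uses only the first 60 nucleotides printed above; the function names are hypothetical and this is not presented as the method used to prepare the listings.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence (spaces allowed) until the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def expected_residues(length_bp: int) -> int:
    """Residues encoded by an open reading frame that ends in one stop codon."""
    return (length_bp - 3) // 3


# First 60 nucleotides of SEQ ID NO: 3 as printed in the listing.
first_60 = "ATGGCAGGAT TTTCTGATCC TCTCAACTTT TGCAAAGCAG AAGACTACTA CAGTGTTGCG"
n_terminus = translate(first_60)

print(n_terminus, expected_residues(3201), expected_residues(3213))
```

The translated stretch corresponds to the first 20 residues of SEQ ID NO: 1 (Met Ala Gly Phe Ser Asp ...), and the stated lengths of 3201 and 3213 base pairs are consistent with the 1066 and 1070 amino acids given for SEQ ID NO: 1 and SEQ ID NO: 2 plus one stop codon each.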

(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3213 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
ATGGCAGGAT TATCCGACCC TCTCAATTTC TGCAAAGCAG AGGACTACTA CGCTGCTGCC 60

AAAGGCTGGA GTGGCCCTCA GAAGATCATT CGCTATGACC AGACCCCTCC TCAGGGTACA 120 

AAAGATCCGA AAAGCTGGCA TGCGGTAAAC CTTCCTTTCG ATGACGGGAC TATGTGTGTA 180 

GTGCAATTCG TCAGACCCTG TGTTTGGAGG GTTAGATATG ACCCCAGTGT CAAGACTTCT 240 

GATGAGTACG GCGATGAGAA TACGAGGACT ATTGTACAAG ACTACATGAC TACTCTGGTT 300 

GGAAACTTGG ACATTTTCAG AGGTCTTACG TGGGTTTCTA CGTTGGAGGA TTCGGGCGAG 360 

TACTACACCT TCAAGTCCGA AGTCACTGCC GTGGACGAAA CCGAACGGAC TCGAAACAAG 420 

GTCGGCGACG GCCTCAAGAT TTACCTATGG AAAAATCCCT TTCGCATCCA GGTAGTGCGT 480 

CTCTTGACCC CCCTGGTGGA CCCTTTCCCC ATTCCCAACG TAGCCAATGC CACAGCCCGT 540 

GTGGCCGACA AGGTTGTTTG GCAGACGTCC CCGAAGACGT TCAGGAAAAA CTTGCATCCG 600 

CAGCATAAGA TGTTGAAGGA TACAGTTCTT GATATTATCA AGCCGGGGCA CGGAGAGTAT 660 

GTGGGTTGGG GAGAGATGGG AGGCATCGAG TTTATGAAGG AGCCAACATT CATGAATTAT 720 

TTCAACTTTG ACAATATGCA ATATCAGCAG GTCTATGCAC AAGGCGCTCT TGATAGTCGT 780 

GAGCCGTTGT ATCACTCTGA TCCCTTCTAT CTCGACGTGA ACTCCAACCC AGAGCACAAG 840 

AACATTACGG CAACCTTTAT CGATAACTAC TCTCAGATTG CCATCGACTT TGGGAAGACC 900 

AACTCAGGCT ACATCAAGCT GGGTACCAGG TATGGCGGTA TCGATTGTTA CGGTATCAGC 960 

GCGGATACGG TCCCGGAGAT TGTGCGACTT TATACTGGAC TTGTTGGGCG TTCGAAGTTG 1020 

AAGCCCAGGT ATATTCTCGG AGCCCACCAA GCTTGTTATG GATACCAGCA GGAAAGTGAC 1080 

TTGCATGCTG TTGTTCAGCA GTACCGTGAC ACCAAGTTTC CGCTTGATGG GTTGCATGTC 1140 

GATGTCGACT TTCAGGACAA TTTCAGAACG TTTACCACTA ACCCGATTAC GTTCCCTAAT 1200 

CCCAAAGAAA TGTTTACCAA TCTAAGGAAC AATGGAATCA AGTGTTCCAC CAACATCACC 1260 

CCTGTTATCA GTATCAGAGA TCGCCCGAAT GGGTACAGTA CCCTCAATGA GGGATATGAT 1320 

AAAAAGTACT TCATCATGGA TGACAGATAT ACCGAGGGGA CAAGTGGGGA CCCGCAAAAT 1380 

GTTCGATACT CTTTTTACGG CGGTGGGAAC CCGGTTGAGG TTAACCCTAA TGATGTTTGG 1440 

GCTCGGCCAG ACTTTGGAGA CAATTATGAC TTCCCTACGA ACTTCAACTG CAAAGACTAC 1500 
CCCTATCATG GTGGTGTGAG TTACGGATAT GGGAATGGCA CTCCAGGTTA CTACCCTGAC 1560

CTTAACAGAG AGGAGGTTCG TATCTGGTGG GGATTGCAGT ACGAGTATCT CTTCAATATG 1620

GGACTAGAGT TTGTATGGCA AGATATGACA ACCCCAGCGA TCCATTCATC ATATGGAGAC 1680

ATGAAAGGGT TGCCCACCCG TCTGCTCGTC ACCGCCGACT CAGTTACCAA TGCCTCTGAG 1740

AAAAAGCTCG CAATTGAAAG TTGGGCTCTT TACTCCTACA ACCTCCATAA AGCAACCTTC 1800

CACGGTCTTG GTCGTCTTGA GTCTCGTAAG AACAAACGTA ACTTCATCCT CGGACGTGGT 1860 

AGTTACGCCG GTGCCTATCG TTTTGCTGGT CTCTGGACTG GAGATAACGC AAGTACGTGG 1920 

GAATTCTGGA AGATTTCGGT CTCCCAAGTT CTTTCTCTAG GTCTCAATGG TGTGTGTATA 1980 

GCGGGGTCTG ATACGGGTGG TTTTGAGCCC GCACGTACTG AGATTGGGGA GGAGAAATAT 2040 

TGCAGTCCGG AGCTACTCAT CAGGTGGTAT ACTGGATCAT TCCTTTTGCC ATGGCTTAGA 2100 

AACCACTACG TCAAGAAGGA CAGGAAATGG TTCCAGGAAC CATACGCGTA CCCCAAGCAT 2160 

CTTGAAACCC ATCCAGAGCT CGCAGATCAA GCATGGCTTT ACAAATCTGT TCTAGAAATT 2220 

TGCAGATACT GGGTAGAGCT AAGATATTCC CTCATCCAGC TCCTTTACGA CTGCATGTTC 2280 

CAAAACGTGG TCGATGGTAT GCCACTTGCC AGATCTATGC TCTTGACCGA TACTGAGGAT 2340 

ACGACCTTCT TCAATGAGAG CCAAAAGTTC CTCGATAACC AATATATGGC TGGTGACGAC 2400 

ATCCTTGTAG CACCCATCCT CCACAGCCGT AACGAGGTTC CGGGAGAGAA CAGAGATGTC 2460 

TATCTCCCTC TATTCCACAC CTGGTACCCC TCAAACTTGA GACCGTGGGA CGATCAGGGA 2520 

GTCGCTTTAG GGAATCCTGT CGAAGGTGGC AGCGTTATCA ACTACACTGC CAGGATTGTT 2580 

GCCCCAGAGG ATTATAATCT CTTCCACAAC GTGGTGCCGG TCTACATCAG AGAGGGTGCC 2640 

ATCATTCCGC AAATTCAGGT ACGCCAGTGG ATTGGCGAAG GAGGGCCTAA TCCCATCAAG 2700 

TTCAATATCT ACCCTGGAAA GGACAAGGAG TATGTGACGT ACCTTGATGA TGGTGTTAGC 2760 

CGCGATAGTG CACCAGATGA CCTCCCGCAG TACCGCGAGG CCTATGAGCA AGCGAAGGTC 2820 

GAAGGCAAAG ACGTCCAGAA GCAACTTGCG GTCATTCAAG GGAATAAGAC TAATGACTTC 2880 

TCCGCCTCCG GGATTGATAA GGAGGCAAAG GGTTATCACC GCAAAGTTTC TATCAAACAG 2940 

GAGTCAAAAG ACAAGACCCG TACTGTCACC ATTGAGCCAA AACACAACGG ATACGACCCC 3000 

TCTAAGGAAG TTGGTAATTA TTATACCATC ATTCTTTGGT ACGCACCGGG CTTTGACGGC 3060 

AGCATGGTCG ATGTGAGCCA GGCGACCGTG AACATCGAGG GCGGGGTGGA ATGCGAAATT 3120 

TTCAAGAACA CCGGCTTGCA TACGGTTGTA GTCAACGTGA AAGAGGTGAT CGGTACCACA 3180 

AAGTCCGTCA AGATCACTTG CACTACCGCT TAG 3213

(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 75 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Lys Asn Leu His Pro Gln His Lys Met Leu Lys Asp Thr Val Leu Asp
1 5 10 15

Ile Val Lys Pro Gly His Gly Glu Tyr Val Gly Trp Gly Glu Met Gly
20 25 30

Gly Ile Gln Phe Met Lys Glu Pro Thr Phe Met Asn Tyr Phe Asn Phe
35 40 45

Asp Asn Met Gln Tyr Gln Gln Val Tyr Ala Gln Gly Ala Leu Asp Ser
50 55 60

Arg Glu Pro Leu Tyr His Ser Asp Pro Phe Tyr
65 70 75

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(3, "")
(D) OTHER INFORMATION: /standard_name= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(6, "")
(D) OTHER INFORMATION: /note= "N is C or T"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(3, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(9, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(15, "")
(D) OTHER INFORMATION: /note= "N is G or A or T or C"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(18, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(21, "")
(D) OTHER INFORMATION: /note= "N is C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CANCANAANA TGCTNAANGA NAC 23

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(3, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(6, "")
(D) OTHER INFORMATION: /note= "N is C or T"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(9, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(15, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(18, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(21, "")
(D) OTHER INFORMATION: /note= "N is C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CANCANAANA TGTTNAANGA NAC 23

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(3, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(6, "")
(D) OTHER INFORMATION: /note= "N is G or A or T or C"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(9, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(12, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(15, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(18, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TANAANGGNT CNCTNTGNTA 20

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(3, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(6, "")
(D) OTHER INFORMATION: /note= "N is G or A or T or C"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(9, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(12, "")
(D) OTHER INFORMATION: /note= "N is G or A or T or C"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(15, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(18, "")
(D) OTHER INFORMATION: /note= "N is G or A"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TANAANGGNT CNGANTGNTA 20

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
AAACTGCAGC TGGCGCGCCA TGGCAGGATT TTCTGAT 
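
The degenerate oligonucleotides in the listing (SEQ ID NO: 6 to 9) carry an N at each ambiguous position, with the allowed bases given in the feature notes. The sketch below shows how such a primer can be represented, how its degeneracy is counted, and how it can be compared with a target sequence. It uses SEQ ID NO: 9 and a 20-nucleotide stretch of SEQ ID NO: 3 that encodes Tyr His Ser Asp Pro Phe Tyr; reading SEQ ID NO: 9 as the reverse primer for that stretch is an inference from the listing, and all function names are hypothetical.

```python
from math import prod

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


# SEQ ID NO: 9 and the bases allowed at each N (1-based positions from the notes).
PRIMER_9 = "TANAANGGNTCNGANTGNTA"
ALLOWED_9 = {3: "GA", 6: "GATC", 9: "GA", 12: "GATC", 15: "GA", 18: "GA"}


def degeneracy(allowed: dict) -> int:
    """Number of distinct oligonucleotides represented by the degenerate primer."""
    return prod(len(bases) for bases in allowed.values())


def primer_matches(primer: str, allowed: dict, target: str) -> bool:
    """True if the target is one of the sequences the degenerate primer covers."""
    if len(primer) != len(target):
        return False
    for position, (p, t) in enumerate(zip(primer, target), start=1):
        options = allowed[position] if p == "N" else p
        if t not in options:
            return False
    return True


# Coding-strand stretch of SEQ ID NO: 3 (first 20 nt of the codons for
# Tyr His Ser Asp Pro Phe Tyr); the primer is compared with its reverse complement.
coding_stretch = "TACCACTCGGATCCCTTCTA"
target = reverse_complement(coding_stretch)

print(degeneracy(ALLOWED_9), primer_matches(PRIMER_9, ALLOWED_9, target))
```

Under these assumptions SEQ ID NO: 9 represents 256 distinct oligonucleotides, one of which is exactly complementary to the chosen stretch of SEQ ID NO: 3.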
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)



A. The indications made below relate to the microorganism referred to in the description 
to .line 3 



on page 



B. IDENTIFICATION OF DEPOSIT 



Further deposits are identified on an additional sheet | | 



Name of depositary institution 
The National Collections of Industrial and Marine Bacteria Limited (NCIMB) 



Address of depositary institution (including postal code and country) 

23 St. Machar Drive 
Aberdeen 
Scotland 
AB2 1RY 

United Kingdom 



Date of deposit 


Accession Number 


C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet | |

In respect of those designations in which a European patent is sought, and any
other designated state having equivalent legislation, a sample of the deposited
microorganism will be made available until the publication of the mention of the
grant of the European patent or until the date on which the application has been
refused or withdrawn or is deemed to be withdrawn, only by the issue of such a
sample to an expert nominated by the person requesting the sample. (Rule 28(4)
EPC).
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)



A. The indications made below relate to the microorganism referred to in the description 

. line S 



on page 



B. IDENTIFICATION OF DEPOSIT 



Further deposits are identified on an additional sheet j ] 



Name of depositary institution 
The National Collections of Industrial and Marine Bacteria Limited (NCIMB) 



Address of depositary institution (including postal code and country)

23 St. Machar Drive 
Aberdeen 
Scotland 
AB2 1RY 



Date of deposit 


Accession Number 


C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet | |

In respect of those designations in which a European patent is sought, and any
other designated state having equivalent legislation, a sample of the deposited
microorganism will be made available until the publication of the mention of the
grant of the European patent or until the date on which the application has been
refused or withdrawn or is deemed to be withdrawn, only by the issue of such a
sample to an expert nominated by the person requesting the sample. (Rule 28(4)
EPC).

WO 95/10617    PCT/EP94/03398

CLAIMS 

1. A method of preparing the enzyme α-1,4-glucan lyase comprising isolating the
enzyme from a culture of a fungus wherein the culture is substantially free of any 
other organism. 

2. A method according to claim 1 wherein the enzyme is isolated and/or further 
purified using a gel that is not degraded by the enzyme. 

3. A method according to claim 2 wherein the gel is based on dextrin or derivatives 
thereof, preferably a cyclodextrin, more preferably beta-cyclodextrin. 

4. A method according to any one of claims 1 to 3 wherein the fungus is Morchella 
costata or Morchella vulgaris. 

5. A GL enzyme prepared by the method according to any one of claims 1 to 4. 

6. An enzyme comprising the amino acid sequence SEQ. ID. No. 1 or SEQ. ID. No.
2, or any variant thereof.

7. A nucleotide sequence capable of coding for the enzyme α-1,4-glucan lyase.

8. A nucleotide sequence according to claim 7 wherein the sequence is a DNA 
sequence. 

9. A nucleotide sequence according to claim 8 wherein the DNA sequence comprises 
a sequence that is the same as, or is complementary to, or has substantial homology 
with, or contains any suitable codon substitutions for any of those of, SEQ. ID. No. 
3 or SEQ. ID. No. 4. 

10. A method of preparing the enzyme α-1,4-glucan lyase comprising expressing the
nucleotide sequence of claim 9.

11. The use of beta-cyclodextrin to purify an enzyme, preferably GL.

12. A nucleotide sequence wherein the DNA sequence is made up of at least a 
sequence that is the same as, or is complementary to, or has substantial homology 
with, or contains any suitable codon substitutions for any of those of, SEQ. ID. No. 
3 or SEQ. ID. No. 4. 
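
The terms "complementary to" and "codon substitutions" used in claims 9 and 12 can be illustrated with a minimal sketch. The Python below assumes standard Watson-Crick base pairing and the standard genetic code. The function names (reverse_complement, translate, is_codon_substituted_variant), the 24-nucleotide fragment and its variant are hypothetical and chosen only for illustration; they are not a reproduction of SEQ. ID. No. 3 or SEQ. ID. No. 4.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the complementary strand, read 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def translate(dna):
    """Translate complete codons in reading frame 1 ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_codon_substituted_variant(dna_a, dna_b):
    """True if the two sequences differ but encode the same amino acids."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


# Hypothetical fragment and a variant with two synonymous changes:
# GCA -> GCC (both Ala) and TTT -> TTC (both Phe).
fragment = "ATGGCAGGATTTTCTGATCCTCTC"
variant = "ATGGCCGGATTCTCTGATCCTCTC"

print(translate(fragment))
print(reverse_complement(fragment))
print(is_codon_substituted_variant(fragment, variant))
```

Under these assumptions the fragment and its variant differ at two nucleotide positions yet translate to the same peptide, which is the sense in which a sequence may "contain suitable codon substitutions".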
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FIGURE 4 

10 20 30 40 50 60

1 AGACAGGTGC GTTTTTGTTT ATTCTATTCT GTGCGGCAGA TATGCACTCA CAAGAAACAA
61 ATTGTACAAA TATTTCTAAT TACAGTTGTA GGTGCAGTTG AAAATCCGGT CGCACAAAGA
121 TCATTGATGC ACAAAGATGA TAACGCCTGA TTAGTACTCA AGGTTTAATT GGGTATGTGT
181 GCGACCTCTC TTTGGCTAGC ATTACCTGAT TGGTTACAAC TGCAAATACT GCGGCAGCAA
241 TGAGGAATGA AGTCAGCATC GATAGCTCGG CCTCATAAAA ATTGATTTCA ATTTTATATT
301 CCCAGTTTTA ATCTCGAATC CTATATAATG GCCATCGTTC CCTCCTCGCC TCTTCATTCT
361 CCTCCATCAC TCCAGCTCAG TCATCCCTCA ACTTGGCCTC CTCTGATATC TTCCGAACAA
421 AACATCTTGT CCAATCTTTT TTTGAGCTAG ATCTCATTAT ACCTCCGTCA TGGCAGGATT
481 TTCTGATCCT CTCAACTTTT GCAAAGCAGA AGACTACTAC AGTGTTGCGC TAGACTGGAA
541 GGGCCCTCAA AAAATCATTG GAGTAGACAC TACTCCTCCA AAGAGCACCA AGTTCCCCAA
601 AAACTGGCAT GGAGTGAACT TGAGATTCGA TGATGGGACT TTAGGTGTGG TTCAGTTCAT
661 TAGGCCGTGC GTTTGGAGGG TTAGATACGA CCCTGGTTTC AAGACCTCTG ACGAGTATGG
721 TGATGAGAAT AC GTGAGTTA CCCCATATGT CATTATTGGT AGCGAAAAAC ATATGCTAAT
781 CAACTAACGA GGCATATAG G AGGACAATTG TGCAAGATTA TATGAGTACT CTGAGTAATA
841 AATTGGATAC TTATAGAGGT CTTACGTGGG AAACCAAGTG TGAGGATTCG GGAGATTTCT
901 TTACCTTCTC AGTAAGTGCC AGTACTGCTA TAGCTCCGCT ATATATATAA CACCACTAAC
961 TAACTGCCCT AAATARTfT.A AGGTCACCGC CGTTGAAAAA TCCGAGCGGA CCCGCAACAA
1021 GGTCGGCGAT GGCCTCAGAA TTCACCTATG GAAAAGCCCT TTCCGCATCC AAGTAGTGCG
1081 CACCTTGACC CCTTTGAAGG ATCCTTACCC CATTCCAAAT GTAGCCGCAG CCGAAGCCCG
1141 TGTGTCCGAC AAGGTCGTTT GGCAAACGTC TCCCAAGACA TTCAGAAAGA ACCTGCATCC
1201 GCAACACAAG ATGCTAAAGG ATACAGTTCT TGACATTGTC AAACCTGGAC ATGGCGAGTA
1261 TGTGGGGTGG GGAGAGATGG GAGGTATCCA GTTTATGAAG GAGCCAACAT TCATGAACTA
1321 TTTTA GTAAG CCCCGAAGAG GTTCCTTATA AATTCTTGGT GGTCATTTTT ACTAACCCAG
1381 TGTAGACTTC GACAATATGC AATACCAGCA AGTCTATGCC CAAGGTGCTC TCGATTCTCG
1441 CGAGCCACT G TAAGTACCGT CCTGTGGCAC GACTTAACCC AATAACTAAT CTTTCAACAA
1501 GGTACCACTC GGATCCCTTC TATCTTGATG TGAACTCCAA CCCGGAGCAC AAGAATATCA
1561 CGGCAACCTT TATCGATAAC TACTCTCAAA TTGCCATCGA CTTTGGAAAG ACCAACTCAG



SUBSTITUTE SHEET (RULE 26)






FIGURE 4 CONTINUED 

1621 GCTACATCAA GCTGGGAACC AGGTATGGTG GTATCGATTG TTACGGTATC AGTGCGGATA 
1681 CGGTCCCGGA AATTGTACGA CTTTATACAG GTCTTGTTGG ACGTTCAAAG TTGAAGCCCA 
1741 GATATATTCT CGGGGCCCAT CAAGCC TGTA AGTCCTTCCC CTCATGAGTG ATTTATTAKA 
1801 CTTGCATAAT AAACTAAf TT <* GTTTTCAAA G ETTATfifiAT ACCAACAGGA AAGTGACTTG 
1861 TATTCTGTGG TCCAGCAGTA CCGTGACTGT AAATTTCCAC TTGACGGGAT TCACGTCGAT 
1921 GTCGATGTTC AGGTAAATGG CCATGGTATC ATTGAAGCTT TGAGAAATGT TCTAAPTfiTR 
1981 TTTATAACAT TCCTAGGACG GCTTCAGAAC TTTCACCACC AACCCACACA CTTTCCCTAA 
2041 CCCCAAAGAG ATGTTTACTA ACTTGAGGAA TAATGGAATC AAGTGCTCCA CCAATATCAC 
2101 TCCTGTTATC AGCATTAACA ACAGAGAGGG TGGATACAGT ACCCTCCTTG AGGGAGTTGA 
2161 CAAAAAATAC TTTATCATGG ACGACAGATA TACCGAGGGA ACAAGTGGGA ATGCGAAGGA 
2221 TGTTCGGTAC ATGTACTACG GTGGTGGTAA TAAGGTTGAG GTCGATCCTA ATGATGTTAA 
2281 TGGTCGGCCA GACTTTAAAG ACAACT AGTA AGTTGTTTAT TTGACTACGA TAGGTAArrr 
2341 GTAAGCGGCA TTAACATATT TfiTAfiTKAfT TCCCCGCGAA CTTCAACAGC AAACAATACC 
2401 CCTATCATGG TGGTGTGAGC TACGGTTATG GGAACGGTAG TGTAAGTGAC GATATfTPAr 
2461 CAACATAATG AAATTTATAA QGACTAACTA RACACAAAAA TTTGTAC firA GGTTTTTACC 
2521 CGGACCTCAA CAGAAAGGAG GTTCGTATCT GGTGGGGAAT GCAGTACAAG TATCTCTTCG 
2581 ATATGGGACT GGAATTTGTG TGGCAAGACA TGACTACCCC AGCAATCCAC ACATCATATG 
2641 GAGACATGAA AGGGTTGCCC ACCCGTCTAC TCGTCACCTC AGACTCCGTC ACCAATGCCT 
2701 CTGAGAAAAA GCTCGCAATT GAAACTTGGG CTCTCTACTC CTACAATCTC CACAAAGCAA 
2761 CTTGGCATGG TCTTAGTCGT CTCGAATCTC GTAAGAACAA ACGAAACTTC ATCCTCGGGC 
2821 GTGGAAGTTA TGCCGGAGCC TATCGTTTTG CTGGTCTCTG GACTGGGGAT AATGCAAGTA 
2881 ACTGGGAATT CTGGAAGATA TCGGTCTCTC AAGTTCTTTC TCTGGGCCTC AATGGTGTGT 
2941 GCATCGCGGG GTCTGATACG GGTGGTTTTG AACCCTACCG TGATGCAAAT GGGGTCGAGG 
3001 AGAAATACTG TAGCCCAGAG C TACTCATCA GGTGGTATAC TGGTTCATTC CTCTTGCCGT 
3061 GGCTCAGGAA CCATTATGTC AAAAAGGACA GGAAATGGTT CCAG GTAATC TATCCTTTfT 
3121 TATCTTTGAA GCATTGAAGA TACTAAGATA TAATTTAft ftA ACCATACTCG TACCCCAAGC 

III] tItJII^ S?!ISK gaa ctcgcagacc aagcatggct CTATAAATCC GTTTTGGAGA 
3241 TCTGTAGGTA CTATGTGGAG CTTAGATACT CCCTCATCCA ACTACTTTAC GACTGCATGT
3301 TTCAAAACGT AGTCGACGGT ATGCCAATCA CCAGATCTAT G GTATGTATT CTACCCTAGG
3361 CTTCCAGAGC AACATATGCT AACCAATTGA ACCTGGGTTT CTAG CTCTTG ACCGATACTG 
3421 AGGATACCAC CTTCTTCAAC GAGAGCCAAA AGTTCCTCGA CAACCAATAT ATGGCTGGTG 
3481 ACGACATTCT TGTTGCACCC ATCCTCCACA GTCGCAAAGA AATTCCAGGC GAAAACAGAG 
3541 ATGTCTATCT CCCTCTTTAC CACACCTGGT ACCCCTCAAA TTTGAGACCA TGGGACGATC 
3601 AAGGAGTCGC TTTGGGGAAT CCTGTCGAAG GTGGTAGTGT CATCAATTAT ACTGCTAGGA 
3661 TTGTTGCACC CGAGGATTAT AATCTCTTCC ACAGCGTGGT ACCAGTCTAC GTTAGAGAGG 
3721 GTAAGCAGTA AAATAATCTC TTCCCAGTTT CAAATACATT TAGCTAGTAG CTAACGCTAT 
3781 GAACCTACAG GTGCCATCAT CCCGCAAATC GAAGTACGCC AATGGACTGG CCAGGGGGGA 
3841 GCCAACCGCA TCAAGTTCAA CATCTACCCT GGAAAGGATA AG GTAAAATT CAATGATCAC 
3901 CCTGCATCTA TTCCATCGCT GGTTTTCTTT ACCCTTACTG ACTTCATTCC TCAAAATAHA 
3961 GGAGTACTGT ACCTATCTTG ATGATGGTGT TAGCCGTGAT AGTGCGCCGG AAGACCTCCC 
4021 ACAGTACAAA GAGACCCACG AACAGTCGAA GGTTGAAGGC GCGGAAATCG CAAAGCAGAT 
4081 TGGAAAGAAG ACGGGTTACA ACATCTCAGG AACCGACCCA GAAGCAAAGG GTTATCACCG 
4141 CAAAGTTGCT GTCACACA AG TAATACCGCC CTTGACTTGT ATCACTTCCT GACATCATfif! 
4201 TAATATTTCT CTGTTTACCT CAAAfiAffiTr. AAAAGACAAG ACGCGTACTG TCACTATTGA
4261 GCCAAAACAC AATGGATACG ACCCTTCCAA AGAGGTGGGT GATTATTATA CCATCATTCT 
4321 TTGGTACGCA CCAGGTTTCG ATGGCAGCAT CGTCGATGTG AGCAAGACGA CTGTGAATGT 
4381 TGAGGGTGGG GTGGAGCACC AAGTTTATAA GAACTCCGAT TTACATACGG TTGTTATCGA 
4441 CGTGAAGGAG GTGATCGGTA CCACAAAGAG CGTCAAGATC ACATGTACTG CCGCJTAA&G 
4501 TCTTTTCTTG GGGGCGGGAG GCGAGACCTT CGAAATGTAT ACGGGAGTGG TAACTCCGGG 
4561 AAAATGGTGA TATGGGGGAT CAAGTTGGAG GGGAATCTGT TTATTTCTTT ATTTCTTTAT 
4621 TTACTGGATT GGAAAATAGG GAGCACAGTT CTGACTGGAT TGGTTTGATT GTTGGCCTCT 
4681 ACGGGTTCTC TTTACTTTGT CTGGAAATCC AATTTATTGT TATGCG

FIGURE 5

10 20 30 40 50 60

1 ATGCAGGCAA CGACAGGCGT TTTTTGTTTT ATCCGCAGAG GTGCAGCAG^ AGGAAACAAA 
61 CCATACAAAC ATTCCTTGAC GCGGTTTTAG GTGCAGTTAA GGCCCGGGCG CACCAAGAAC 
121 ATTGATGTAC TTGGTCTAAA AAAGATCATA ATACCCGATT AGTGTTCATG GTTTGATTGG 
181 GTCTAAGTAC AAGTTTTACA GAGTTCAGCT TAGTTCATTG TTCGAAACTA CCAATATCAC 
241 ACCTATGCCT GCTGGCATTG ATAGCTCGGC TTGTGAAAGC TGATTACAAT CTTACATTTC 
301 TGATTTAATA TCGGACTGAT CTATATATAA GGGTCATCAT TTCCTCTCCG CCTTTTGGTT 
361 CTCTTTCATC ACCCCAGCCC AATCATCACC GTTGGCCTTT ACTTCTCTCT TCCGTTGATA 
421 TTTTCTCGAC AAAACATCTT GTCCACTGTT AGGCTAGCTC CCAGAATTAT CCCTCCAACA 
481 TGGCAGGATT ATCCGACCCT CTCAATTTCT GCAAAGCAGA GGACTACTAC GCTGCTGCCA 
541 AAGGCTGGAG TGGCCCTCAG AAGATCATTC GCTATGACCA GACCCCTCCT CAGGGTACAA 
601 AAGATCCGAA AAGCTGGCAT GCGGTAAACC TTCCTTTCGA TGACGGGACT ATGTGTGTAG 
661 TGCAATTCGT CAGACCCTGT GTTTGGAGGG TTAGATATGA CCCCAGTGTC AAGACTTCTG 
721 ATGAGTACGG CGATGAGAAT ACGTGGGTCG rrrACTCAAT TAArTATcrr cr^^ 
781 ATGGAAAGCT TCTfirTAACC GATTAATfiAfi GrftTfiTAC GA GGACTATTGT ACAAGACTAC 
841 ATGACTACTC TGGTTGGAAA CTTGGACATT TTCAGAGGTC TTACGTGGGT TTCTACGTTG 
901 GAGGATTCGG GCGAGTACTA CACCTTCAAG GCAAGCCTf A arajr^ c tp.aat.t.t 
961 TATATATCAC AAPAAACTAA CTAGTrATAf rtG TCCGAAGT CACTGCCGTG GACGAAACCG
1021 AACGGACTCG AAACAAGGTC GGCGACGGCC TCAAGATTTA CCTATGGAAA AATCCCTTTC
1081 GCATCCAGGT AGTGCGTCTC TTGACCCCCC TGGTGGACCC TTTCCCCATT CCCAACGTAG 
1141 CCAATGCCAC AGCCCGTGTG GCCGACAAGG TTGTTTGGCA GACGTCCCCG AAGACGTTCA 
1201 GGAAAAACTT GCATCCGCAG CATAAGATGT TGAAGGATAC AGTTCTTGAT ATTATCAAGC 
1261 CGGGGCACGG AGAGTATGTG GGTTGGGGAG AGATGGGAGG CATCGAGTTT ATGAAGGAGC 
1321 CAACATTCAT GAATTATTTC AGTAAGCTC.T tb^ttt rrTATrTrrx r.,rrr^ TJ 
1381 TTTGCTAAGG AAAfTGTAG A CTTTGACAAT ATGCAATATC AGCAGGTCTA TGCACAAGGC 
1441 GCTCTTGATA GTCGTGAGCC GTTGTAAGTA AffSTrrjcTG AfATnTrAT, *rr*r^ 
1501 CTGATCGTTC AATAAG GTAT CACTCTGATC CCTTCTATCT CGACGTGAAC TCCAACCCAG 
1561 AGCACAAGAA CATTACGGCA ACCTTTATCG ATAACTACTC TCAGATTGCC ATCGACTTTG
1621 GGAAGACCAA CTCAGGCTAC ATCAAGCTGG GTACCAGGTA TGGCGGTATC GATTGTTACG

1681 GTATCAGCGC GGATACGGTC CCGGAGATTG TGCGACTTTA TACTGGACTT GTTGGGCGTT 

1741 CGAAGTTGAA GCCCAGGTAT ATTCTCGGAG CCCACCAAGC TT GTAAGCCC GCCCCCTTTA 

1801 CGATGCATTT ATTAGGGGTC CACAGACTAA ACTTGTTCCA AAG GTTATGG ATACCAGCAG 

1861 GAAAGTGACT TGCATGCTGT TGTTCAGCAG TACCGTGACA CCAAGTTTCC GCTTGATGGG 

1921 TTGCATGTCG ATGTCGACTT TCAG GTAAAT GGCCCAGGTA TCGTTGAAGC TTTGGAGAAT

1981 GCTAATTGTG CTCGTAAAAC TTTAAG GACA ATTTCAGAAC GTTTACCACT AACCCGATTA 

2041 CGTTCCCTAA TCCCAAAGAA ATGTTTACCA ATCTAAGGAA CAATGGAATC AAGTGTTCCA 

2101 CCAACATCAC CCCTGTTATC AGTATCAGAG ATCGCCCGAA TGGGTACAGT ACCCTCAATG 

2161 AGGGATATGA TAAAAAGTAC TTCATCATGG ATGACAGATA TACCGAGGGG ACAAGTGGGG 

2221 ACCCGCAAAA TGTTCGATAC TCTTTTTACG GCGGTGGGAA CCCGGTTGAG GTTAACCCTA 

2281 ATGATGTTTG GGCTCGGCCA GACTTTGGAG ACAATT AGTA AGTTACTCAA TAGGCTACTT 

2341 GAGATATTCT GTAGGTGGCA TTAACACGAC TATART GAf.T TCCCTACGAA CTTCAACTGC 

2401 AAAGACTACC CCTATCATGG TGGTGTGAGT TACGGATATG GGAATGGCAC T GTAAGTGAT 

2461 AATAAGTCAT AAATACAAGG TAATTCATGG AGACTAATCA GTGGTAAATG AATTTTAG CC 

2521 AGGTTACTAC CCTGACCTTA ACAGAGAGGA GGTTCGTATC TGGTGGGGAT TGCAGTACGA 

2581 GTATCTCTTC AATATGGGAC TAGAGTTTGT ATGGCAAGAT ATGACAACCC CAGCGATCCA 

2641 TTCATCATAT GGAGACATGA AAGGGTTGCC CACCCGTCTG CTCGTCACCG CCGACTCAGT 

2701 TACCAATGCC TCTGAGAAAA AGCTCGCAAT TGAAAGTTGG GCTCTTTACT CCTACAACCT 

2761 CCATAAAGCA ACCTTCCACG GTCTTGGTCG TCTTGAGTCT CGTAAGAACA AACGTAACTT 

2821 CATCCTCGGA CGTGGTAGTT ACGCCGGTGC CTATCGTTTT GCTGGTCTCT GGACTGGAGA 

2881 TAACGCAAGT ACGTGGGAAT TCTGGAAGAT TTCGGTCTCC CAAGTTCTTT CTCTAGGTCT 

2941 CAATGGTGTG TGTATAGCGG GGTCTGATAC GGGTGGTTTT GAGCCCGCAC GTACTGAGAT 

3001 TGGGGAGGAG AAATATTGCA GTCCGGAGCT ACTCATCAGG TGGTATACTG GATCATTCCT 

3061 TTTGCCATGG CTTAGAAACC ACTACGTCAA GAAGGACAGG AAATGGTTCC AG GTAATATA 

3121 CTCTTTCTGG TCTCTGAGTA TCGAAGACGC TAAGACAATA TAG GAACCAT ACGCGTACCC 

3181 CAAGCATCTT GAAACCCATC CAGAGCTCGC AGATCAAGCA TGGCTTTACA AATCTCTTCT 

3241 AGAAATTTGC AGATACTGGG TAGAGCTAAG ATATTCCCTC ATCCAGCTCC TTTACGACTG

FIGURE 5 CONTINUED

3301 CATGTTCCAA AACGTGGTCG ATGGTATGCC ACTTGCCAGA TCTATG GTAT GCATTTT^ Tr
3361 CgTCTCCTTT CACGATAATG HArrACTrr ^ ACHGAATTTT nTnnn nr TTGACCGATA
3421 CTGAGGATAC GACCTTCTTC AATGAGAGCC AAAAGTTCCT CGATAACCAA TATATGGCTG 
3481 GTGACGACAT CCTTGTAGCA CCCATCCTCC ACAGCCGTAA CGAGGTTCCG GGAGAGAACA 
3541 GAGATGTCTA TCTCCCTCTA TTCCACACCT GGTACCCCTC AAACTTGAGA CCGTGGGACG 
3601 ATCAGGGAGT CGCTTTAGGG AATCCTGTCG AAGGTGGCAG CGTTATCAAC TACACTGCCA 
3661 GGATTGTTGC CCCAGAGGAT TATMTCTCT TCCACAACGT GGTGCCGGTC TACATCAGAG 
3721 AGGGTAAGCG ATMAATMT TTrTTITMr TTrrAK.Tir ^y,^ r ~. fnrrTTn 
3781 ASCCAGGTGC CATCATTCCG CAAATTCAGG TACGCCAGTG GATTGGCGAA GGAGGGCCTA 
3841 ATCCCATCAA GTTCAATATC TACCCTGGAA AGGACAAG GT ATATTfTrrfl 

3901 GCATTTATTC TTTrTCTurT IITIMTMrT TrATrTc^T .t tit nrnfCTOX 

3961 TTGATGATGG TGTTAGCCGC GATAGTGCAC CAGATGACCT CCCGCAGTAC CGCGAGGCCT 
4021 ATGAGCAAGC GAAGGTCGAA GGCAAAGACG TCCAGAAGCA ACTTGCGGTC ATTCAAGGGA 
4081 ATAAGACTAA TGACTTCTCC GCCTCCGGGA TTGATAAGGA GGCAAAGGGT TATCACCGCA 
4141 AAGTTTCTAT CAAACAGGTA CATfiflTTTra ttttt^ ^■.. <eTr , m „„^ . 
4201 Atcctaacat TGrTTrTrTT mnmrt AGTCAAAAGA CAAGACCCGT ACTGTCACCA
4261 TTGAGCCAAA ACACAACGGA TACGACCCCT CTAAGGAAGT TGGTAATTAT TATACCATCA 
4321 TTCTTTGGTA CGCACCGGGC TTTGACGGCA GCATCGTCGA TGTGAGCCAG GCGACCGTGA 
4381 ACATCGAGGG CGGGGTGGAA TGCGAAATTT TCAAGAACAC CGGCTTGCAT ACGGTTGTAG
4441 TCAACGTGAA AGAGGTGATC GGTACCACAA AGTCCGTCAA GATCACTTGC ACTACCGCTT 
4501 *GAGCTCTTT TATGAGGGGT ATATGGGAGT GGCAGCTCAG AAATTTGGGA AGCTTCTGGG 
4561 TATTCCTTTT GTTTATTTAC TTATTTATTG AATCGACCAA TACGGGTGGG ATTCTCTCTG 
4621 GTTTTTGTGA GGCTATGTTT TACTTGGTCT GAAAATCAAA TTCGTTCTCA

FIGURE 6

MC  MAGFSDPLNFCKAEDYYSVALDWKGPQKIIGVDTTPPKSTKFPKNWHGVN  50
MV  MAGLSDPLNFCKAEDYYAAAKGWSGPQKIIRYDQTPPQGTKDPKSWHAVN  50

MC  LRFDDGTLGVVQFIRPCVWRVRYDPGFKTSDEYGDENTRTIVQDYMSTLS  100
MV  LPFDDGTMCVVQFVRPCVWRVRYDPSVKTSDEYGDENTRTIVQDYMTTLV  100

MC  NKLDTYRGLTWETKCEDSGDFFTFSSKVTAVEKSERTRNKVGDGLRIHLW  150
MV  GNLDIFRGLTWVSTLEDSGEYYTFKSEVTAVDETERTRNKVGDGLKIYLW  150

MC  KSPFRIQVVRTLTPLKDPYPIPNVAAAEARVSDKVVWQTSPKTFRKNLHP  200
MV  KNPFRIQVVRLLTPLVDPFPIPNVANATARVADKVVWQTSPKTFRKNLHP  200

MC  QHKMLKDTVLDIVKPGHGEYVGWGEMGGIQFMKEPTFMNYFNFDNMQYQQ  250
MV  QHKMLKDTVLDIIKPGHGEYVGWGEMGGIEFMKEPTFMNYFNFDNMQYQQ  250

MC  VYAQGALDSREPLYHSDPFYLDVNSNPEHKNITATFIDNYSQIAIDFGKT  300
MV  VYAQGALDSREPLYHSDPFYLDVNSNPEHKNITATFIDNYSQIAIDFGKT  300

MC  NSGYIKLGTRYGGIDCYGISADTVPEIVRLYTGLVGRSKLKPRYILGAHQ  350
MV  NSGYIKLGTRYGGIDCYGISADTVPEIVRLYTGLVGRSKLKPRYILGAHQ  350

MC  ACYGYQQESDLYSVVQQYRDCKFPLDGIHVDVDVQDGFRTFTTNPHTFPN  400
MV  ACYGYQQESDLHAVVQQYRDTKFPLDGLHVDVDFQDNFRTFTTNPITFPN  400

MC  PKEMFTNLRNNGIKCSTNITPVISINNREGGYSTLLEGVDKKYFIMDDRY  450
MV  PKEMFTNLRNNGIKCSTNITPVISIRDRPNGYSTLNEGYDKKYFIMDDRY  450

MC  TEGTSGNAKDVRYMYYGGGNKVEVDPNDVNGRPDFKDNYDFPANFNSKQY  500
MV  TEGTSGDPQNVRYSFYGGGNPVEVNPNDVWARPDFGDNYDFPTNFNCKDY  500

MC  PYHGGVSYGYGNGSAGFYPDLNRKEVRIWWGMQYKYLFDMGLEFVWQDMT  550
MV  PYHGGVSYGYGNGTPGYYPDLNREEVRIWWGLQYEYLFNMGLEFVWQDMT  550

MC  TPAIHTSYGDMKGLPTRLLVTSDSVTNASEKKLAIETWALYSYNLHKATW  600
MV  TPAIHSSYGDMKGLPTRLLVTADSVTNASEKKLAIESWALYSYNLHKATF  600

MC  HGLSRLESRKNKRNFILGRGSYAGAYRFAGLWTGDNASNWEFWKISVSQV  650
MV  HGLGRLESRKNKRNFILGRGSYAGAYRFAGLWTGDNASTWEFWKISVSQV  650

MC  LSLGLNGVCIAGSDTGGFEPYRDANGVEEKYCSPELLIRWYTGSFLLPWL  700
MV  LSLGLNGVCIAGSDTGGFEPAR-TEIGEEKYCSPELLIRWYTGSFLLPWL  699
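
The degree of similarity between the two aligned sequences of Figure 6 can be estimated by counting identical residues column by column. The following Python is a minimal sketch, assuming that identity is computed only over aligned columns in which neither row carries a gap ("-"). The function name percent_identity is hypothetical, and the two strings are the first 50-residue block of the MC and MV rows with OCR spacing removed.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Return (identical, compared) over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    identical = 0
    compared = 0
    for res_a, res_b in zip(row_a, row_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are skipped, not counted as mismatches
        compared += 1
        if res_a == res_b:
            identical += 1
    return identical, compared


# First aligned block (residues 1 to 50) of each row in Figure 6.
mc_block = "MAGFSDPLNFCKAEDYYSVALDWKGPQKIIGVDTTPPKSTKFPKNWHGVN"
mv_block = "MAGLSDPLNFCKAEDYYAAAKGWSGPQKIIRYDQTPPQGTKDPKSWHAVN"

identical, compared = percent_identity(mc_block, mv_block)
print(f"{identical}/{compared} identical ({100.0 * identical / compared:.1f}%)")
```

The same function can be applied to each further block, or to the concatenated rows, to obtain an overall figure; other conventions for handling gaps would give slightly different values.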
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FIGURE 7 



MAGFSDPLNF CKAEDYYSVA LDWKGPQKII GVDTTPPKST KFPKNWHGVN LRFDDGTLGV VQFIRPCVWR
VRYDPGFKTS DEYGDENTRT IVQDYMSTLS NKLDTYRGLT WETKCEDSGD FFTFSSKVTA VEKSERTRNK
VGDGLRIHLW KSPFRIQVVR TLTPLKDPYP IPNVAAAEAR VSDKVVWQTS PKTFRKNLHP QHKMLKDTVL
DIVKPGHGEY VGWGEMGGIQ FMKEPTFMNY FNFDNMQYQQ VYAQGALDSR EPLYHSDPFY LDVNSNPEHK
NITATFIDNY SQIAIDFGKT NSGYIKLGTR YGGIDCYGIS ADTVPEIVRL YTGLVGRSKL KPRYILGAHQ
ACYGYQQESD LYSVVQQYRD CKFPLDGIHV DVDVQDGFRT FTTNPHTFPN PKEMFTNLRN NGIKCSTNIT
PVISINNREG GYSTLLEGVD KKYFIMDDRY TEGTSGNAKD VRYMYYGGGN KVEVDPNDVN GRPDFKDNYD
FPANFNSKQY PYHGGVSYGY GNGSAGFYPD LNRKEVRIWW GMQYKYLFDM GLEFVWQDMT TPAIHTSYGD
MKGLPTRLLV TSDSVTNASE KKLAIETWAL YSYNLHKATW HGLSRLESRK NKRNFILGRG SYAGAYRFAG
LWTGDNASNW EFWKISVSQV LSLGLNGVCI AGSDTGGFEP YRDANGVEEK YCSPELLIRW YTGSFLLPWL
RNHYVKKDRK WFQEPYSYPK HLETHPELAD QAWLYKSVLE ICRYYVELRY SLIQLLYDCM FQNVVDGMPI
TRSMLLTDTE DTTFFNESQK FLDNQYMAGD DILVAPILHS RKEIPGENRD VYLPLYHTWY PSNLRPWDDQ
GVALGNPVEG GSVINYTARI VAPEDYNLFH SVVPVYVREG AIIPQIEVRQ WTGQGGANRI KFNIYPGKDK
EYCTYLDDGV SRDSAPEDLP QYKETHEQSK VEGAEIAKQI GKKTGYNISG TDPEAKGYHR KVAVTQTSKD
KTRTVTIEPK HNGYDPSKEV GDYYTIILWY APGFDGSIVD VSKTTVNVEG GVEHQVYKNS DLHTVVIDVK
EVIGTTKSVK ITCTAA